Problem Statement¶
Context¶
Service businesses such as banks have to contend with the problem of 'Customer Churn', i.e., customers leaving and joining another service provider. It is important to understand which aspects of the service influence a customer's decision in this regard, so that management can concentrate improvement efforts on these priorities.
Objective¶
As a data scientist with the bank, you need to build a neural network based classifier that can determine whether a customer will leave the bank in the next 6 months.
Data Dictionary¶
CustomerId: Unique ID which is assigned to each customer
Surname: Last name of the customer
CreditScore: A score summarizing the customer's credit history
Geography: A customer’s location
Gender: Gender of the customer
Age: Age of the customer
Tenure: Number of years for which the customer has been with the bank
NumOfProducts: Number of products the customer has purchased through the bank
Balance: Account balance
HasCrCard: Categorical variable indicating whether or not the customer has a credit card
EstimatedSalary: Estimated salary
isActiveMember: Categorical variable indicating whether or not the customer is an active member of the bank (active in the sense of using bank products regularly, making transactions, etc.)
Exited: Whether or not the customer left the bank within six months. It can take two values:
- 0 = No (customer did not leave the bank)
- 1 = Yes (customer left the bank)
Importing necessary libraries¶
# Library for data manipulation and analysis.
import pandas as pd
# Fundamental package for scientific computing.
import numpy as np
#splitting datasets into training and testing sets.
from sklearn.model_selection import train_test_split
#Imports tools for data preprocessing including label encoding, one-hot encoding, and standard scaling
from sklearn.preprocessing import LabelEncoder, OneHotEncoder,StandardScaler
#Imports a class for imputing missing values in datasets.
from sklearn.impute import SimpleImputer
#Imports the Matplotlib library for creating visualizations.
import matplotlib.pyplot as plt
# Imports the Seaborn library for statistical data visualization.
import seaborn as sns
# Time related functions.
import time
#Imports functions for evaluating the performance of machine learning models
from sklearn.metrics import confusion_matrix, f1_score,accuracy_score, recall_score, precision_score, classification_report
#Imports TensorFlow and the required Keras layers and utilities.
import tensorflow as tf
from tensorflow.keras.layers import Dense, Input, Dropout
from tensorflow.keras import backend
from tensorflow.keras.utils import set_random_seed
from tensorflow.keras.models import Sequential
#Import SMOTE to create synthetic samples
from imblearn.over_sampling import SMOTE
#Import warnings to suppress unnecessary warnings
import warnings
warnings.filterwarnings("ignore")
Loading the dataset¶
#Mount the google drive
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
#Reading the dataset.
data = pd.read_csv('/content/drive/My Drive/Bank Churn Project/bank-1.csv')
#Copy of the data set
df = data.copy()
Data Overview¶
Displaying the first few rows of the dataset¶
# Let's view the first 5 rows of the data
data.head()
| RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 15634602 | Hargrave | 619 | France | Female | 42 | 2 | 0.00 | 1 | 1 | 1 | 101348.88 | 1 |
| 1 | 2 | 15647311 | Hill | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 2 | 3 | 15619304 | Onio | 502 | France | Female | 42 | 8 | 159660.80 | 3 | 1 | 0 | 113931.57 | 1 |
| 3 | 4 | 15701354 | Boni | 699 | France | Female | 39 | 1 | 0.00 | 2 | 0 | 0 | 93826.63 | 0 |
| 4 | 5 | 15737888 | Mitchell | 850 | Spain | Female | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.10 | 0 |
Displaying the last few rows of the dataset¶
# Let's view the last 5 rows of the data
data.tail()
| RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9995 | 9996 | 15606229 | Obijiaku | 771 | France | Male | 39 | 5 | 0.00 | 2 | 1 | 0 | 96270.64 | 0 |
| 9996 | 9997 | 15569892 | Johnstone | 516 | France | Male | 35 | 10 | 57369.61 | 1 | 1 | 1 | 101699.77 | 0 |
| 9997 | 9998 | 15584532 | Liu | 709 | France | Female | 36 | 7 | 0.00 | 1 | 0 | 1 | 42085.58 | 1 |
| 9998 | 9999 | 15682355 | Sabbatini | 772 | Germany | Male | 42 | 3 | 75075.31 | 2 | 1 | 0 | 92888.52 | 1 |
| 9999 | 10000 | 15628319 | Walker | 792 | France | Female | 28 | 4 | 130142.79 | 1 | 1 | 0 | 38190.78 | 0 |
Checking the shape of the dataset¶
# Checking the number of rows and columns in the data
data.shape
(10000, 14)
- The dataset has 10000 rows and 14 columns
Checking the data types of the columns of the dataset¶
# Let's check the datatypes of the columns in the dataset
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
 #   Column           Non-Null Count  Dtype
---  ------           --------------  -----
 0   RowNumber        10000 non-null  int64
 1   CustomerId       10000 non-null  int64
 2   Surname          10000 non-null  object
 3   CreditScore      10000 non-null  int64
 4   Geography        10000 non-null  object
 5   Gender           10000 non-null  object
 6   Age              10000 non-null  int64
 7   Tenure           10000 non-null  int64
 8   Balance          10000 non-null  float64
 9   NumOfProducts    10000 non-null  int64
 10  HasCrCard        10000 non-null  int64
 11  IsActiveMember   10000 non-null  int64
 12  EstimatedSalary  10000 non-null  float64
 13  Exited           10000 non-null  int64
dtypes: float64(2), int64(9), object(3)
memory usage: 1.1+ MB
- There are 10000 observations and 14 columns in the data.
- 11 columns are numerical and 3 columns are of the object datatype
- There are no missing values
Checking for duplicate values¶
# Let's check for duplicate values in the data
data.duplicated().sum()
0
- There are no duplicates
Checking for missing values¶
# Let's check for missing values in the data
round(data.isnull().sum() / data.isnull().count() * 100, 2)
| 0 | |
|---|---|
| RowNumber | 0.0 |
| CustomerId | 0.0 |
| Surname | 0.0 |
| CreditScore | 0.0 |
| Geography | 0.0 |
| Gender | 0.0 |
| Age | 0.0 |
| Tenure | 0.0 |
| Balance | 0.0 |
| NumOfProducts | 0.0 |
| HasCrCard | 0.0 |
| IsActiveMember | 0.0 |
| EstimatedSalary | 0.0 |
| Exited | 0.0 |
- There are no missing values
Statistical Summary¶
data.describe(include='all').T
| count | unique | top | freq | mean | std | min | 25% | 50% | 75% | max | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| RowNumber | 10000.0 | NaN | NaN | NaN | 5000.5 | 2886.89568 | 1.0 | 2500.75 | 5000.5 | 7500.25 | 10000.0 |
| CustomerId | 10000.0 | NaN | NaN | NaN | 15690940.5694 | 71936.186123 | 15565701.0 | 15628528.25 | 15690738.0 | 15753233.75 | 15815690.0 |
| Surname | 10000 | 2932 | Smith | 32 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| CreditScore | 10000.0 | NaN | NaN | NaN | 650.5288 | 96.653299 | 350.0 | 584.0 | 652.0 | 718.0 | 850.0 |
| Geography | 10000 | 3 | France | 5014 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Gender | 10000 | 2 | Male | 5457 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Age | 10000.0 | NaN | NaN | NaN | 38.9218 | 10.487806 | 18.0 | 32.0 | 37.0 | 44.0 | 92.0 |
| Tenure | 10000.0 | NaN | NaN | NaN | 5.0128 | 2.892174 | 0.0 | 3.0 | 5.0 | 7.0 | 10.0 |
| Balance | 10000.0 | NaN | NaN | NaN | 76485.889288 | 62397.405202 | 0.0 | 0.0 | 97198.54 | 127644.24 | 250898.09 |
| NumOfProducts | 10000.0 | NaN | NaN | NaN | 1.5302 | 0.581654 | 1.0 | 1.0 | 1.0 | 2.0 | 4.0 |
| HasCrCard | 10000.0 | NaN | NaN | NaN | 0.7055 | 0.45584 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 |
| IsActiveMember | 10000.0 | NaN | NaN | NaN | 0.5151 | 0.499797 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 |
| EstimatedSalary | 10000.0 | NaN | NaN | NaN | 100090.239881 | 57510.492818 | 11.58 | 51002.11 | 100193.915 | 149388.2475 | 199992.48 |
| Exited | 10000.0 | NaN | NaN | NaN | 0.2037 | 0.402769 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
- Credit scores range from 350 to 850; 75% of customers have a score of 718 or lower
- Ages range from 18 to 92; half of the customers are older than 37
- Customers have been with the bank for 0 to 10 years, with a median tenure of 5 years
- Account balances range from 0 to around 250K
- Customers use 1 to 4 bank products; 75% use at most 2
- Estimated salary ranges from 11.58 to around 200K
Checking the unique values¶
# Let's check the number of unique values in each column
data.nunique()
| 0 | |
|---|---|
| RowNumber | 10000 |
| CustomerId | 10000 |
| Surname | 2932 |
| CreditScore | 460 |
| Geography | 3 |
| Gender | 2 |
| Age | 70 |
| Tenure | 11 |
| Balance | 6382 |
| NumOfProducts | 4 |
| HasCrCard | 2 |
| IsActiveMember | 2 |
| EstimatedSalary | 9999 |
| Exited | 2 |
- Each value of the column 'CustomerId' is a unique identifier for a customer. Hence we can drop this column as it will not add any predictive power to the model.
- The 'RowNumber' column is also a unique identifier and can be dropped.
- 'Surname' does not add any value to the model, and we will drop this column as well.
- The 'Geography' column has 3 unique categories.
- The 'Tenure' column has 11 unique values, corresponding to customers having been with the bank for 0-10 years.
#Display the unique values of each categorical columns
for i in data.describe(include=["object"]).columns:
print("Unique values in", i, "are :")
print(data[i].value_counts())
print("*" * 50)
Unique values in Surname are :
Surname
Smith 32
Scott 29
Martin 29
Walker 28
Brown 26
..
Izmailov 1
Bold 1
Bonham 1
Poninski 1
Burbidge 1
Name: count, Length: 2932, dtype: int64
**************************************************
Unique values in Geography are :
Geography
France 5014
Germany 2509
Spain 2477
Name: count, dtype: int64
**************************************************
Unique values in Gender are :
Gender
Male 5457
Female 4543
Name: count, dtype: int64
**************************************************
- There are 2932 unique surnames. This column will be dropped as it will not add value to the model.
- The three geography values are France, Germany, and Spain. About 50% of the customers are from France; the remaining half is split roughly evenly between Germany and Spain.
- There are more male customers than female customers.
Target variable "Exited" Value counts¶
#value counts of target variable "Exited"
data["Exited"].value_counts()
| count | |
|---|---|
| Exited | |
| 0 | 7963 |
| 1 | 2037 |
#Proportion of target data distribution
data['Exited'].value_counts(normalize=True)
| proportion | |
|---|---|
| Exited | |
| 0 | 0.7963 |
| 1 | 0.2037 |
- Around 80% of the records are in class 0 (customer did not leave the bank within the 6-month window)
- 20% of the records are in class 1 (customer left the bank within the 6-month window)
Exploratory Data Analysis¶
Below functions are used for EDA¶
# Function to create labeled barplots
def labeled_barplot(data, feature, perc=False, n=None):
"""
Barplot with percentage at the top
data: dataframe
feature: dataframe column
perc: whether to display percentages instead of count (default is False)
n: displays the top n category levels (default is None, i.e., display all levels)
"""
total = len(data[feature]) # length of the column
count = data[feature].nunique()
if n is None:
plt.figure(figsize=(count + 1, 5))
else:
plt.figure(figsize=(n + 1, 5))
plt.xticks(rotation=90, fontsize=15)
ax = sns.countplot(
data=data,
x=feature,
palette="Paired",
order=data[feature].value_counts().index[:n].sort_values(),
)
for p in ax.patches:
if perc == True:
label = "{:.1f}%".format(
100 * p.get_height() / total
) # percentage of each class of the category
else:
label = p.get_height() # count of each level of the category
x = p.get_x() + p.get_width() / 2 # width of the plot
y = p.get_height() # height of the plot
ax.annotate(
label,
(x, y),
ha="center",
va="center",
size=12,
xytext=(0, 5),
textcoords="offset points",
) # annotate the percentage
plt.show() # show the plot
# Function to plot a boxplot and a histogram along the same scale.
def histogram_boxplot(data, feature, figsize=(12, 7), kde=False, bins=None):
"""
Boxplot and histogram combined
data: dataframe
feature: dataframe column
figsize: size of figure (default (12,7))
kde: whether to the show density curve (default False)
bins: number of bins for histogram (default None)
"""
f2, (ax_box2, ax_hist2) = plt.subplots(
nrows=2, # Number of rows of the subplot grid= 2
sharex=True, # x-axis will be shared among all subplots
gridspec_kw={"height_ratios": (0.25, 0.75)},
figsize=figsize,
) # creating the 2 subplots
sns.boxplot(
data=data, x=feature, ax=ax_box2, showmeans=True, color="violet"
) # boxplot will be created and a star will indicate the mean value of the column
    sns.histplot(
        data=data, x=feature, kde=kde, ax=ax_hist2, bins=bins if bins else "auto"
    )  # For histogram
ax_hist2.axvline(
data[feature].mean(), color="green", linestyle="--"
) # Add mean to the histogram
ax_hist2.axvline(
data[feature].median(), color="black", linestyle="-"
) # Add median to the histogram
# Function to plot distributions with respect to the target
def distribution_plot_wrt_target(data, predictor, target):
    """
    Histograms and boxplots of a predictor, split by the target classes
    data: dataframe
    predictor: independent variable
    target: target variable
    """
    fig, axs = plt.subplots(2, 2, figsize=(12, 10))
    target_uniq = data[target].unique()
    axs[0, 0].set_title("Distribution of " + predictor + " for " + target + "=" + str(target_uniq[0]))
sns.histplot(
data=data[data[target] == target_uniq[0]],
x=predictor,
kde=True,
ax=axs[0, 0],
color="teal",
)
    axs[0, 1].set_title("Distribution of " + predictor + " for " + target + "=" + str(target_uniq[1]))
sns.histplot(
data=data[data[target] == target_uniq[1]],
x=predictor,
kde=True,
ax=axs[0, 1],
color="orange",
)
axs[1, 0].set_title("Boxplot w.r.t target")
sns.boxplot(data=data, x=target, y=predictor, ax=axs[1, 0], palette="gist_rainbow")
axs[1, 1].set_title("Boxplot (without outliers) w.r.t target")
sns.boxplot(
data=data,
x=target,
y=predictor,
ax=axs[1, 1],
showfliers=False,
palette="gist_rainbow",
)
plt.tight_layout()
plt.show()
# function to plot stacked bar chart
def stacked_barplot(data, predictor, target):
"""
Print the category counts and plot a stacked bar chart
data: dataframe
predictor: independent variable
target: target variable
"""
count = data[predictor].nunique()
sorter = data[target].value_counts().index[-1]
tab1 = pd.crosstab(data[predictor], data[target], margins=True).sort_values(
by=sorter, ascending=False
)
print(tab1)
print("-" * 120)
tab = pd.crosstab(data[predictor], data[target], normalize="index").sort_values(
by=sorter, ascending=False
)
tab.plot(kind="bar", stacked=True, figsize=(count + 1, 5))
    plt.legend(loc="upper left", bbox_to_anchor=(1, 1))
plt.show()
Feature Engineering¶
Removing Features¶
- 'CustomerId' and 'RowNumber' contain only unique identifiers that do not add value to the model
- 'Surname' also does not add any value to the model, so we will remove it
# Removing features Customer Id , RowNumber and Surname
data.drop(["CustomerId", "RowNumber","Surname"], axis=1, inplace=True)
Univariate Analysis¶
Numerical Column Analysis¶
# defining the list of numerical features to plot
features = data.select_dtypes(include=['number']).columns.tolist()
print("Numerical Features:")
print(features)
# creating the histograms for the numerical features selected above
for i, feature in enumerate(features):
histogram_boxplot(data, feature)
Numerical Features: ['CreditScore', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard', 'IsActiveMember', 'EstimatedSalary', 'Exited']
# defining the figure size
plt.figure(figsize=(15, 10))
print("Numerical Features:")
print(features)
# creating the histograms
for i, feature in enumerate(features):
plt.subplot(4, 3, i+1) # assign a subplot in the main plot
sns.histplot(data=data, x=feature,kde='True') # plot the histogram
plt.tight_layout() # to add spacing between plots
plt.show()
Numerical Features: ['CreditScore', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard', 'IsActiveMember', 'EstimatedSalary', 'Exited']
- The credit score distribution is left skewed
- The age distribution is right skewed
- Balance is left skewed, with a large spike at zero (at least a quarter of customers have a zero balance)
- EstimatedSalary has a roughly symmetric distribution
- There are outliers in CreditScore and Age, but we will not treat them as they appear to be valid values
Categorical Columns Analysis¶
Gender¶
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 11 columns):
 #   Column           Non-Null Count  Dtype
---  ------           --------------  -----
 0   CreditScore      10000 non-null  int64
 1   Geography        10000 non-null  object
 2   Gender           10000 non-null  object
 3   Age              10000 non-null  int64
 4   Tenure           10000 non-null  int64
 5   Balance          10000 non-null  float64
 6   NumOfProducts    10000 non-null  int64
 7   HasCrCard        10000 non-null  int64
 8   IsActiveMember   10000 non-null  int64
 9   EstimatedSalary  10000 non-null  float64
 10  Exited           10000 non-null  int64
dtypes: float64(2), int64(7), object(2)
memory usage: 859.5+ KB
labeled_barplot(data, "Gender",perc=True)
- There are more male customers than female customers
- 54.6% are male and the remaining 45.4% are female
Geography¶
labeled_barplot(data, "Geography",perc=True)
- There are more customers from France than Germany and Spain.
- 50.1% customers are from France, 25.1% from Germany and 24.8% from Spain
HasCreditCard¶
labeled_barplot(data, "HasCrCard",perc=True)
- 70.5% of customers have a credit card
- 29.4% of customers do not have a credit card
IsActiveMember¶
labeled_barplot(data, "IsActiveMember",perc=True)
- 51.5% of customers are active members of the bank
- 48.5% of customers are not active members
Bivariate Analysis¶
CreditScore vs Exited¶
distribution_plot_wrt_target(data, "CreditScore", "Exited")
- The highest number of churns is found among customers with credit scores between 600 and 700.
- There are outliers in the credit scores of customers who exited, but we will not treat them.
Age vs Exited¶
distribution_plot_wrt_target(data, "Age", "Exited")
- Customers aged 40-50 are more likely to exit than younger customers
Tenure vs Exited¶
distribution_plot_wrt_target(data, "Tenure", "Exited")
- 25% of the customers who exited had a tenure of less than 2 years.
- 50% of the customers who exited had been with the bank for less than 5 years.
Balance vs Exited¶
distribution_plot_wrt_target(data, "Balance", "Exited")
- Customers with no balance are more likely to exit than customers holding some balance.
NumofProducts vs Exited¶
distribution_plot_wrt_target(data, "NumOfProducts", "Exited")
- Customers using more bank products are less likely to churn
- Customers with only 1 product are more likely to churn than others
EstimatedSalary vs Exited¶
distribution_plot_wrt_target(data, "EstimatedSalary", "Exited")
- 50% of the customers who exited had EstimatedSalary less than 100000
HasCrCard vs Exited¶
stacked_barplot(data, "HasCrCard", "Exited")
Exited        0     1    All
HasCrCard
All        7963  2037  10000
1          5631  1424   7055
0          2332   613   2945
------------------------------------------------------------------------------------------------------------------------
- The churn rate is close to 20% for customers both with and without a credit card
- The churn proportions in the two groups are roughly similar in size, so credit card ownership may not be a strong predictor of churn in this dataset
IsActiveMember vs Exited¶
stacked_barplot(data, "IsActiveMember", "Exited")
Exited             0     1    All
IsActiveMember
All             7963  2037  10000
0               3547  1302   4849
1               4416   735   5151
------------------------------------------------------------------------------------------------------------------------
- ActiveMember customers are less likely to churn
Gender vs Exited¶
stacked_barplot(data, "Gender", "Exited")
Exited     0     1    All
Gender
All     7963  2037  10000
Female  3404  1139   4543
Male    4559   898   5457
------------------------------------------------------------------------------------------------------------------------
- Female Bank customers are more likely to churn than male customers.
Geography vs Exited¶
stacked_barplot(data, "Geography", "Exited")
Exited        0     1    All
Geography
All        7963  2037  10000
Germany    1695   814   2509
France     4204   810   5014
Spain      2064   413   2477
------------------------------------------------------------------------------------------------------------------------
- Germany has the highest churn rate (about 32% of its customers exited), well above France and Spain (both around 16%)
HeatMap¶
# defining the figure size
plt.figure(figsize=(10, 7))
# plotting the correlation heatmap
sns.heatmap(data.corr(numeric_only = True), annot=True, fmt='0.2f', cmap='coolwarm');
- Age and Balance are positively correlated with the target variable Exited, with Age showing the stronger correlation.
- CreditScore, Tenure, NumOfProducts, and HasCrCard are negatively correlated with the target variable.
- Balance and NumOfProducts are negatively correlated with each other.
- IsActiveMember has the strongest negative correlation with Exited (quantified in the snippet below).
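To read these relationships as numbers rather than colors, a quick sketch (reusing the same `data` frame as the heatmap) ranks each numeric feature's correlation with the target:
# Rank numeric features by their linear correlation with Exited
corr_with_target = (
    data.corr(numeric_only=True)["Exited"]
    .drop("Exited")  # exclude the target's self-correlation of 1.0
    .sort_values()
)
print(corr_with_target)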
Scatter Plot¶
# CreditScore vs Balance
plt.figure(figsize=(15,8))
sns.scatterplot(data=data,
x='CreditScore',
y='Balance',
hue='Exited',
);
# Age vs Balance
plt.figure(figsize=(15,8))
sns.scatterplot(data=data,
x='Age',
y='Balance',
hue='Exited',
);
Pair Plot¶
# Pair plot matrix (sns.pairplot creates its own figure, so no plt.figure call is needed)
sns.pairplot(data, corner=True, hue="Exited")
plt.show()
Data Preprocessing¶
Encoding the categorical variables¶
Dummy Variable Creation
# Encoding the categorical variables using one-hot encoding
data = pd.get_dummies(
data,
columns=["Geography","Gender"],
drop_first=True,dtype=int
)
data.head()
| CreditScore | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited | Geography_Germany | Geography_Spain | Gender_Male | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 619 | 42 | 2 | 0.00 | 1 | 1 | 1 | 101348.88 | 1 | 0 | 0 | 0 |
| 1 | 608 | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 | 0 | 1 | 0 |
| 2 | 502 | 42 | 8 | 159660.80 | 3 | 1 | 0 | 113931.57 | 1 | 0 | 0 | 0 |
| 3 | 699 | 39 | 1 | 0.00 | 2 | 0 | 0 | 93826.63 | 0 | 0 | 0 | 0 |
| 4 | 850 | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.10 | 0 | 0 | 1 | 0 |
Train-validation-test Split¶
X = data.drop("Exited", axis=1)
y = data["Exited"]
# Splitting data into training, validation and test sets:
# first we split data into 2 parts, say temporary and test
X_temp, X_test, y_temp, y_test = train_test_split(
X, y, test_size=0.2, random_state=42, stratify=y
)
# Splitting the Train dataset into the Training and Validation set.
X_train, X_val, y_train, y_val = train_test_split(X_temp, y_temp, test_size = 0.2, random_state = 42,stratify = y_temp)
#Printing the shapes.
print(X_train.shape,y_train.shape)
print(X_val.shape,y_val.shape)
print(X_test.shape,y_test.shape)
(6400, 11) (6400,) (1600, 11) (1600,) (2000, 11) (2000,)
- X_train has 6400 records, X_val has 1600 records, and X_test has 2000 records, each with 11 columns (10000 × 0.8 = 8000 for train+validation, of which 20% = 1600 goes to validation)
Data Normalization¶
X_train.head()
| CreditScore | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Geography_Germany | Geography_Spain | Gender_Male | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3006 | 747 | 45 | 1 | 114959.12 | 1 | 1 | 0 | 189362.39 | 0 | 0 | 0 |
| 5735 | 585 | 44 | 4 | 0.00 | 2 | 0 | 1 | 101728.46 | 0 | 1 | 0 |
| 5652 | 691 | 29 | 9 | 0.00 | 2 | 0 | 0 | 199635.93 | 0 | 0 | 0 |
| 26 | 756 | 36 | 2 | 136815.64 | 1 | 1 | 1 | 170041.95 | 1 | 0 | 1 |
| 3596 | 643 | 36 | 0 | 148159.71 | 1 | 0 | 0 | 55835.66 | 0 | 0 | 1 |
#Transform numerical features to a common scale
# defining the list of numerical features to normalize
num_features = ['CreditScore','Age','Tenure','Balance','NumOfProducts','EstimatedSalary']
scaler = StandardScaler()
X_train[num_features] = scaler.fit_transform(X_train[num_features])
X_val[num_features] = scaler.transform(X_val[num_features])
X_test[num_features] = scaler.transform(X_test[num_features])
X_train.head()
| CreditScore | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Geography_Germany | Geography_Spain | Gender_Male | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3006 | 0.988678 | 0.572685 | -1.384980 | 0.610704 | -0.909878 | 1 | 0 | 1.566070 | 0 | 0 | 0 |
| 5735 | -0.683985 | 0.477619 | -0.355056 | -1.237347 | 0.804346 | 0 | 1 | 0.036348 | 0 | 1 | 0 |
| 5652 | 0.410474 | -0.948376 | 1.361485 | -1.237347 | 0.804346 | 0 | 0 | 1.745403 | 0 | 0 | 0 |
| 26 | 1.081604 | -0.282911 | -1.041672 | 0.962063 | -0.909878 | 1 | 1 | 1.228816 | 1 | 0 | 1 |
| 3596 | -0.085130 | -0.282911 | -1.728288 | 1.144427 | -0.909878 | 0 | 0 | -0.764748 | 0 | 0 | 1 |
- Numerical features are standardized using z-score normalization; categorical features such as gender and geography are left unscaled.
- Standardization centers each numerical feature at zero mean and unit standard deviation, as verified below.
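As a quick sanity check (a small sketch reusing the `num_features` list defined above), the scaled training columns should now show means of roughly 0 and standard deviations of roughly 1:
# Verify the z-score transform: means ~0 and standard deviations ~1
print(X_train[num_features].mean().round(3))
print(X_train[num_features].std().round(3))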
Model Building¶
Utility Functions¶
def plot(history, name):
"""
Function to plot loss/accuracy
history: an object which stores the metrics and losses.
name: can be one of Loss or Accuracy
"""
fig, ax = plt.subplots() #Creating a subplot with figure and axes.
plt.plot(history.history[name]) #Plotting the train accuracy or train loss
plt.plot(history.history['val_'+name]) #Plotting the validation accuracy or validation loss
plt.title('Model ' + name.capitalize()) #Defining the title of the plot.
plt.ylabel(name.capitalize()) #Capitalizing the first letter.
plt.xlabel('Epoch') #Defining the label for the x-axis.
fig.legend(['Train', 'Validation'], loc="outside right upper") #Defining the legend, loc controls the position of the legend.
# defining a function to compute different metrics to check performance of a classification model
def model_performance_classification(
model, predictors, target, threshold=0.5
):
"""
Function to compute different metrics to check classification model performance
model: classifier
predictors: independent variables
target: dependent variable
threshold: threshold for classifying the observation as class 1
The choice of the threshold depends on the specific business problem and the desired balance between different types of errors:
Higher Threshold (e.g., 0.7): This leads to higher precision (fewer false positives) but lower recall (more false negatives).
It's suitable when the cost of false positives is high, like in spam detection.
Lower Threshold (e.g., 0.3): This leads to higher recall (fewer false negatives) but lower precision (more false positives).
It's suitable when the cost of false negatives is high, like in disease diagnosis.
Using threshold of 0.5 in this case
"""
# checking which probabilities are greater than threshold
pred = model.predict(predictors) > threshold
acc = accuracy_score(target, pred) # to compute Accuracy
recall = recall_score(target, pred, average='weighted') # to compute Recall
precision = precision_score(target, pred, average='weighted') # to compute Precision
f1 = f1_score(target, pred, average='weighted') # to compute F1-score
# creating a dataframe of metrics
df_perf = pd.DataFrame(
{"Accuracy": acc, "Recall": recall, "Precision": precision, "F1 Score": f1,},
index=[0],
)
return df_perf
Model Evaluation Criterion¶
The logic for choosing the metric best suited to this business scenario is laid out below.
Model can make wrong predictions as:¶
- Predicting that a customer will leave when the customer does not exit (a false positive).
- Predicting that a customer will not leave when the customer exits (a false negative).
Which case is more important?¶
- Predicting that a customer will not exit when in reality the customer does exit is a huge loss to the bank.
How to reduce this loss, i.e., reduce the False Negatives?¶
- The bank would want Recall to be maximized: the greater the recall, the higher the chance of minimizing false negatives. Hence, the focus should be on increasing recall, i.e., correctly identifying the true positives (class 1), so that the bank can identify the customers who will really leave, as illustrated below.
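A small illustration with made-up labels (hypothetical values, not from this dataset) shows why recall is the right lever: recall = TP / (TP + FN), so every avoided false negative raises it directly.
# Toy example: 4 actual churners, of which the model misses 2 (false negatives)
import numpy as np
from sklearn.metrics import recall_score

y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])
y_pred = np.array([1, 1, 0, 0, 0, 0, 0, 1])
print(recall_score(y_true, y_pred))  # TP / (TP + FN) = 2 / (2 + 2) = 0.5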
Fixing the seed for random number generators¶
First, let's set the seed for the random number generators in NumPy, Python, and TensorFlow to be able to reproduce the same results every time we run the code.
# Fixing the seed for random number generators
np.random.seed(42)
import random
random.seed(42)
tf.random.set_seed(42)
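Note that the `set_random_seed` utility imported earlier bundles these three calls; calling it once is an equivalent alternative.
# Equivalent one-liner: seeds Python's random, NumPy, and TensorFlow together
set_random_seed(42)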
Calculate Class Weights¶
#Proportion of target data distribution
data['Exited'].value_counts(normalize=True)
| proportion | |
|---|---|
| Exited | |
| 0 | 0.7963 |
| 1 | 0.2037 |
As we are dealing with an imbalance in the class distribution, we will use class weights to let the model give proportionally more importance to the minority class.
While an 80:20 class ratio is not a severe imbalance, we will use class weights to check whether they improve the model's performance on the minority class and prevent bias toward the majority class.
# Calculate class weights for imbalanced dataset
cw = (y_train.shape[0]) / np.bincount(y_train)
# Create a dictionary mapping class indices to their respective class weights
cw_dict = {}
for i in range(cw.shape[0]):
cw_dict[i] = cw[i]
cw_dict
{0: 1.2558869701726845, 1: 4.9079754601226995}
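For comparison, scikit-learn's 'balanced' heuristic also divides by the number of classes, i.e. n_samples / (n_classes * bincount), which gives half the values above. The ratio between the two class weights is the same either way, so both choices emphasize the minority class in the same proportion (the overall scale only rescales the loss, which interacts mildly with the learning rate). A quick cross-check:
# scikit-learn's 'balanced' weights: n_samples / (n_classes * bincount)
from sklearn.utils.class_weight import compute_class_weight

balanced = compute_class_weight("balanced", classes=np.unique(y_train), y=y_train)
print(dict(enumerate(balanced)))  # roughly {0: 0.63, 1: 2.45}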
Neural Network with SGD Optimizer¶
Model0: Neural Network with layers of 14, 7 and 1 neurons, without class weights¶
- Let's start with a neural network consisting of
- two hidden layers with 14 and 7 neurons respectively
- activation function of ReLU.
- SGD as the optimizer
- Class Weights not used
- epochs=25
- batchsize = 64
# defining the batch size and # epochs
epochs = 25
batch_size = 64
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
#Initializing the neural network of 14,7,1
model0 = Sequential()
#First Hidden Layer
model0.add(Dense(14,activation="relu",input_dim=X_train.shape[1]))
#Second Hidden Layer
model0.add(Dense(7,activation="relu"))
#output layer :Sigmoid is used for binary classification
#we have an output of 1 node, which is the desired dimension of our output: whether the customer will exit or not
model0.add(Dense(1,activation="sigmoid"))
model0.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 14)                  │             168 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 7)                   │             105 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 1)                   │               8 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
Total params: 281 (1.10 KB)
Trainable params: 281 (1.10 KB)
Non-trainable params: 0 (0.00 B)
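The 281 parameters can be verified by hand: a Dense layer has (inputs + 1) × units parameters (one weight per input plus one bias per unit). With 11 input features, the first hidden layer has (11 + 1) × 14 = 168 parameters, the second (14 + 1) × 7 = 105, and the output layer (7 + 1) × 1 = 8, for a total of 168 + 105 + 8 = 281.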
optimizer = tf.keras.optimizers.SGD() # defining SGD as the optimizer to be used
model0.compile(loss='binary_crossentropy', optimizer=optimizer,metrics=['accuracy'])
start = time.time()
history0 = model0.fit(X_train, y_train, epochs=epochs, validation_data=(X_val,y_val),batch_size=batch_size,verbose=2)
end=time.time()
Epoch 1/25 100/100 - 2s - 22ms/step - accuracy: 0.7902 - loss: 0.5177 - val_accuracy: 0.7937 - val_loss: 0.4865 Epoch 2/25 100/100 - 1s - 13ms/step - accuracy: 0.7964 - loss: 0.4693 - val_accuracy: 0.7937 - val_loss: 0.4723 Epoch 3/25 100/100 - 1s - 5ms/step - accuracy: 0.7966 - loss: 0.4587 - val_accuracy: 0.7937 - val_loss: 0.4649 Epoch 4/25 100/100 - 0s - 4ms/step - accuracy: 0.7962 - loss: 0.4518 - val_accuracy: 0.7944 - val_loss: 0.4594 Epoch 5/25 100/100 - 0s - 3ms/step - accuracy: 0.7987 - loss: 0.4465 - val_accuracy: 0.7956 - val_loss: 0.4549 Epoch 6/25 100/100 - 1s - 6ms/step - accuracy: 0.8005 - loss: 0.4421 - val_accuracy: 0.7956 - val_loss: 0.4512 Epoch 7/25 100/100 - 1s - 7ms/step - accuracy: 0.8025 - loss: 0.4384 - val_accuracy: 0.7969 - val_loss: 0.4481 Epoch 8/25 100/100 - 0s - 4ms/step - accuracy: 0.8055 - loss: 0.4352 - val_accuracy: 0.7987 - val_loss: 0.4454 Epoch 9/25 100/100 - 1s - 5ms/step - accuracy: 0.8084 - loss: 0.4324 - val_accuracy: 0.7981 - val_loss: 0.4429 Epoch 10/25 100/100 - 0s - 3ms/step - accuracy: 0.8094 - loss: 0.4297 - val_accuracy: 0.8006 - val_loss: 0.4406 Epoch 11/25 100/100 - 0s - 4ms/step - accuracy: 0.8116 - loss: 0.4273 - val_accuracy: 0.8025 - val_loss: 0.4383 Epoch 12/25 100/100 - 0s - 4ms/step - accuracy: 0.8122 - loss: 0.4250 - val_accuracy: 0.8019 - val_loss: 0.4361 Epoch 13/25 100/100 - 1s - 6ms/step - accuracy: 0.8141 - loss: 0.4228 - val_accuracy: 0.8044 - val_loss: 0.4340 Epoch 14/25 100/100 - 0s - 3ms/step - accuracy: 0.8158 - loss: 0.4206 - val_accuracy: 0.8062 - val_loss: 0.4319 Epoch 15/25 100/100 - 0s - 4ms/step - accuracy: 0.8184 - loss: 0.4185 - val_accuracy: 0.8075 - val_loss: 0.4298 Epoch 16/25 100/100 - 1s - 5ms/step - accuracy: 0.8197 - loss: 0.4164 - val_accuracy: 0.8087 - val_loss: 0.4278 Epoch 17/25 100/100 - 1s - 6ms/step - accuracy: 0.8213 - loss: 0.4143 - val_accuracy: 0.8112 - val_loss: 0.4258 Epoch 18/25 100/100 - 1s - 8ms/step - accuracy: 0.8228 - loss: 0.4122 - val_accuracy: 0.8150 - val_loss: 0.4238 Epoch 19/25 100/100 - 1s - 6ms/step - accuracy: 0.8244 - loss: 0.4101 - val_accuracy: 0.8169 - val_loss: 0.4218 Epoch 20/25 100/100 - 0s - 5ms/step - accuracy: 0.8259 - loss: 0.4080 - val_accuracy: 0.8188 - val_loss: 0.4197 Epoch 21/25 100/100 - 1s - 6ms/step - accuracy: 0.8277 - loss: 0.4059 - val_accuracy: 0.8200 - val_loss: 0.4177 Epoch 22/25 100/100 - 0s - 5ms/step - accuracy: 0.8294 - loss: 0.4038 - val_accuracy: 0.8219 - val_loss: 0.4157 Epoch 23/25 100/100 - 0s - 4ms/step - accuracy: 0.8298 - loss: 0.4017 - val_accuracy: 0.8238 - val_loss: 0.4136 Epoch 24/25 100/100 - 0s - 3ms/step - accuracy: 0.8314 - loss: 0.3995 - val_accuracy: 0.8244 - val_loss: 0.4116 Epoch 25/25 100/100 - 0s - 3ms/step - accuracy: 0.8319 - loss: 0.3973 - val_accuracy: 0.8250 - val_loss: 0.4095
print("Time taken in seconds ",end-start)
Time taken in seconds 15.092943668365479
plot(history0,'loss')
plot(history0,'accuracy')
Observations:
- Training accuracy increases steadily with the number of epochs
- Validation accuracy also increases with the epochs
model0_train_perf = model_performance_classification(model0,X_train,y_train)
model0_train_perf
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.833281 | 0.833281 | 0.821079 | 0.801198 |
model0_val_perf = model_performance_classification(model0,X_val,y_val)
model0_val_perf
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.825 | 0.825 | 0.806742 | 0.790613 |
Observations:
- Model0's training recall is around 83.3% and its validation recall around 82.5%
- The number of parameters is 281
- Training and validation recall are close, so the model generalizes well to unseen data.
Still, let's try a couple more changes to the model and analyse further.
Model1: Neural Network with layers of 14, 7 and 1 neurons, without class weights and with momentum¶
Build Model with
- two hidden layers with 14 and 7 neurons respectively
- activation function of ReLU.
- SGD as the optimizer
- Class Weights not used
- Momentum used
- epochs = 25
- batchsize = 64
# defining the batch size and # epochs
epochs = 25
batch_size = 64
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
#Initializing the neural network of 14,7 and 1
model1 = Sequential()
#First Hidden Layer
model1.add(Dense(14,activation="relu",input_dim=X_train.shape[1]))
#Second Hidden Layer
model1.add(Dense(7,activation="relu"))
#output layer :Sigmoid is used for binary classification
#we have an output of 1 node, which is the desired dimension of our output: whether the customer will exit or not
model1.add(Dense(1,activation="sigmoid"))
model1.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 14)                  │             168 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 7)                   │             105 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 1)                   │               8 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
Total params: 281 (1.10 KB)
Trainable params: 281 (1.10 KB)
Non-trainable params: 0 (0.00 B)
optimizer = tf.keras.optimizers.SGD(momentum=0.9) # defining SGD as the optimizer with momentum to be used
model1.compile(loss='binary_crossentropy', optimizer=optimizer,metrics=['accuracy'])
start = time.time()
history1 = model1.fit(X_train, y_train, epochs=epochs, validation_data=(X_val,y_val),batch_size=batch_size,verbose=2)
end = time.time()
Epoch 1/25 100/100 - 2s - 15ms/step - accuracy: 0.7477 - loss: 0.5449 - val_accuracy: 0.7962 - val_loss: 0.4846 Epoch 2/25 100/100 - 0s - 4ms/step - accuracy: 0.7962 - loss: 0.4708 - val_accuracy: 0.7956 - val_loss: 0.4486 Epoch 3/25 100/100 - 1s - 6ms/step - accuracy: 0.7966 - loss: 0.4382 - val_accuracy: 0.8025 - val_loss: 0.4266 Epoch 4/25 100/100 - 1s - 5ms/step - accuracy: 0.8008 - loss: 0.4228 - val_accuracy: 0.8081 - val_loss: 0.4157 Epoch 5/25 100/100 - 1s - 6ms/step - accuracy: 0.8087 - loss: 0.4125 - val_accuracy: 0.8213 - val_loss: 0.4050 Epoch 6/25 100/100 - 0s - 3ms/step - accuracy: 0.8141 - loss: 0.4034 - val_accuracy: 0.8281 - val_loss: 0.3946 Epoch 7/25 100/100 - 0s - 3ms/step - accuracy: 0.8205 - loss: 0.3918 - val_accuracy: 0.8388 - val_loss: 0.3797 Epoch 8/25 100/100 - 0s - 4ms/step - accuracy: 0.8372 - loss: 0.3765 - val_accuracy: 0.8500 - val_loss: 0.3683 Epoch 9/25 100/100 - 1s - 5ms/step - accuracy: 0.8469 - loss: 0.3665 - val_accuracy: 0.8544 - val_loss: 0.3615 Epoch 10/25 100/100 - 0s - 3ms/step - accuracy: 0.8495 - loss: 0.3604 - val_accuracy: 0.8556 - val_loss: 0.3570 Epoch 11/25 100/100 - 1s - 7ms/step - accuracy: 0.8519 - loss: 0.3562 - val_accuracy: 0.8575 - val_loss: 0.3537 Epoch 12/25 100/100 - 1s - 5ms/step - accuracy: 0.8537 - loss: 0.3529 - val_accuracy: 0.8569 - val_loss: 0.3514 Epoch 13/25 100/100 - 0s - 4ms/step - accuracy: 0.8556 - loss: 0.3503 - val_accuracy: 0.8600 - val_loss: 0.3495 Epoch 14/25 100/100 - 0s - 5ms/step - accuracy: 0.8577 - loss: 0.3482 - val_accuracy: 0.8619 - val_loss: 0.3481 Epoch 15/25 100/100 - 1s - 6ms/step - accuracy: 0.8580 - loss: 0.3464 - val_accuracy: 0.8619 - val_loss: 0.3469 Epoch 16/25 100/100 - 0s - 5ms/step - accuracy: 0.8592 - loss: 0.3449 - val_accuracy: 0.8619 - val_loss: 0.3462 Epoch 17/25 100/100 - 1s - 6ms/step - accuracy: 0.8584 - loss: 0.3435 - val_accuracy: 0.8606 - val_loss: 0.3455 Epoch 18/25 100/100 - 0s - 4ms/step - accuracy: 0.8595 - loss: 0.3423 - val_accuracy: 0.8581 - val_loss: 0.3448 Epoch 19/25 100/100 - 0s - 4ms/step - accuracy: 0.8587 - loss: 0.3412 - val_accuracy: 0.8606 - val_loss: 0.3441 Epoch 20/25 100/100 - 1s - 6ms/step - accuracy: 0.8592 - loss: 0.3403 - val_accuracy: 0.8575 - val_loss: 0.3436 Epoch 21/25 100/100 - 1s - 6ms/step - accuracy: 0.8598 - loss: 0.3395 - val_accuracy: 0.8581 - val_loss: 0.3434 Epoch 22/25 100/100 - 0s - 3ms/step - accuracy: 0.8595 - loss: 0.3387 - val_accuracy: 0.8587 - val_loss: 0.3428 Epoch 23/25 100/100 - 1s - 6ms/step - accuracy: 0.8605 - loss: 0.3380 - val_accuracy: 0.8594 - val_loss: 0.3425 Epoch 24/25 100/100 - 0s - 3ms/step - accuracy: 0.8606 - loss: 0.3374 - val_accuracy: 0.8581 - val_loss: 0.3422 Epoch 25/25 100/100 - 1s - 6ms/step - accuracy: 0.8611 - loss: 0.3368 - val_accuracy: 0.8575 - val_loss: 0.3420
print("Time taken in seconds ",end-start)
Time taken in seconds 13.762194395065308
plot(history1,'loss')
plot(history1,'accuracy')
model1_train_perf = model_performance_classification(model1,X_train,y_train)
model1_train_perf
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.862031 | 0.862031 | 0.852781 | 0.849874 |
model1_val_perf = model_performance_classification(model1,X_val,y_val)
model1_val_perf
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.8575 | 0.8575 | 0.847249 | 0.844355 |
Observations:
- Model1 has a training recall of 86.2% and a validation recall of 85.75%
- The model took around 13-14 seconds to train
- The number of parameters is 281
- Training and validation recall are close, so the model generalizes well to unseen data; we will still try a couple more changes to the model and analyse further
- Model1's performance with momentum is better than Model0's (a sketch of the momentum update follows below)
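For intuition, SGD with momentum maintains a velocity term that accumulates an exponentially decaying sum of past gradients, smoothing the descent direction and speeding up progress along consistent slopes. A minimal NumPy-style sketch of the update rule as documented for tf.keras.optimizers.SGD (lr=0.01 is the Keras default):
# One SGD-with-momentum step: the velocity accumulates past gradients,
# then the weights move along the velocity rather than the raw gradient.
def sgd_momentum_step(w, grad, velocity, lr=0.01, momentum=0.9):
    velocity = momentum * velocity - lr * grad
    w = w + velocity
    return w, velocity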
Model2: Neural Network with layers of 14, 7 and 1 neurons, without class weights, with epochs=50¶
Build Model with
- two hidden layers with 14 and 7 neurons respectively
- activation function of ReLU.
- SGD as the optimizer
- Class Weights not used
- Momentum not used
- epochs = 50
- batchsize = 64
# defining the batch size and # epochs
epochs = 50
batch_size = 64
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
#Initializing the neural network of 14,7 and 1
model2 = Sequential()
#First Hidden Layer
model2.add(Dense(14,activation="relu",input_dim=X_train.shape[1]))
#Second Hidden Layer
model2.add(Dense(7,activation="relu"))
#output layer :Sigmoid is used for binary classification
#we have an output of 1 node, which is the desired dimension of our output: whether the customer will exit or not
model2.add(Dense(1,activation="sigmoid"))
model2.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 14)                  │             168 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 7)                   │             105 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 1)                   │               8 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
Total params: 281 (1.10 KB)
Trainable params: 281 (1.10 KB)
Non-trainable params: 0 (0.00 B)
optimizer = tf.keras.optimizers.SGD() # defining SGD (without momentum) as the optimizer to be used
model2.compile(loss='binary_crossentropy', optimizer=optimizer,metrics=['accuracy'])
start = time.time()
history2 = model2.fit(X_train, y_train, epochs=epochs, validation_data=(X_val,y_val),batch_size=batch_size,verbose=2)
end = time.time()
Epoch 1/50 100/100 - 1s - 12ms/step - accuracy: 0.5780 - loss: 0.7006 - val_accuracy: 0.7631 - val_loss: 0.6026 Epoch 2/50 100/100 - 1s - 7ms/step - accuracy: 0.7894 - loss: 0.5647 - val_accuracy: 0.7969 - val_loss: 0.5361 Epoch 3/50 100/100 - 0s - 3ms/step - accuracy: 0.7962 - loss: 0.5240 - val_accuracy: 0.7962 - val_loss: 0.5121 Epoch 4/50 100/100 - 1s - 7ms/step - accuracy: 0.7964 - loss: 0.5074 - val_accuracy: 0.7962 - val_loss: 0.5010 Epoch 5/50 100/100 - 1s - 6ms/step - accuracy: 0.7966 - loss: 0.4983 - val_accuracy: 0.7962 - val_loss: 0.4941 Epoch 6/50 100/100 - 0s - 4ms/step - accuracy: 0.7967 - loss: 0.4915 - val_accuracy: 0.7962 - val_loss: 0.4886 Epoch 7/50 100/100 - 1s - 6ms/step - accuracy: 0.7964 - loss: 0.4854 - val_accuracy: 0.7956 - val_loss: 0.4835 Epoch 8/50 100/100 - 1s - 6ms/step - accuracy: 0.7966 - loss: 0.4798 - val_accuracy: 0.7956 - val_loss: 0.4787 Epoch 9/50 100/100 - 0s - 5ms/step - accuracy: 0.7973 - loss: 0.4746 - val_accuracy: 0.7962 - val_loss: 0.4742 Epoch 10/50 100/100 - 0s - 5ms/step - accuracy: 0.7967 - loss: 0.4696 - val_accuracy: 0.7950 - val_loss: 0.4700 Epoch 11/50 100/100 - 1s - 6ms/step - accuracy: 0.7977 - loss: 0.4648 - val_accuracy: 0.7931 - val_loss: 0.4659 Epoch 12/50 100/100 - 1s - 5ms/step - accuracy: 0.7980 - loss: 0.4605 - val_accuracy: 0.7931 - val_loss: 0.4622 Epoch 13/50 100/100 - 0s - 3ms/step - accuracy: 0.7983 - loss: 0.4565 - val_accuracy: 0.7931 - val_loss: 0.4589 Epoch 14/50 100/100 - 1s - 6ms/step - accuracy: 0.7995 - loss: 0.4528 - val_accuracy: 0.7944 - val_loss: 0.4558 Epoch 15/50 100/100 - 0s - 3ms/step - accuracy: 0.7980 - loss: 0.4494 - val_accuracy: 0.7962 - val_loss: 0.4530 Epoch 16/50 100/100 - 0s - 4ms/step - accuracy: 0.7984 - loss: 0.4463 - val_accuracy: 0.7981 - val_loss: 0.4503 Epoch 17/50 100/100 - 1s - 5ms/step - accuracy: 0.8000 - loss: 0.4435 - val_accuracy: 0.8012 - val_loss: 0.4480 Epoch 18/50 100/100 - 0s - 3ms/step - accuracy: 0.8012 - loss: 0.4409 - val_accuracy: 0.8019 - val_loss: 0.4458 Epoch 19/50 100/100 - 0s - 3ms/step - accuracy: 0.8028 - loss: 0.4385 - val_accuracy: 0.8044 - val_loss: 0.4438 Epoch 20/50 100/100 - 0s - 3ms/step - accuracy: 0.8031 - loss: 0.4362 - val_accuracy: 0.8056 - val_loss: 0.4419 Epoch 21/50 100/100 - 1s - 7ms/step - accuracy: 0.8039 - loss: 0.4341 - val_accuracy: 0.8069 - val_loss: 0.4400 Epoch 22/50 100/100 - 1s - 6ms/step - accuracy: 0.8055 - loss: 0.4320 - val_accuracy: 0.8069 - val_loss: 0.4381 Epoch 23/50 100/100 - 0s - 3ms/step - accuracy: 0.8059 - loss: 0.4301 - val_accuracy: 0.8087 - val_loss: 0.4363 Epoch 24/50 100/100 - 0s - 3ms/step - accuracy: 0.8083 - loss: 0.4281 - val_accuracy: 0.8100 - val_loss: 0.4345 Epoch 25/50 100/100 - 0s - 3ms/step - accuracy: 0.8116 - loss: 0.4262 - val_accuracy: 0.8112 - val_loss: 0.4327 Epoch 26/50 100/100 - 1s - 6ms/step - accuracy: 0.8123 - loss: 0.4244 - val_accuracy: 0.8112 - val_loss: 0.4309 Epoch 27/50 100/100 - 1s - 7ms/step - accuracy: 0.8141 - loss: 0.4225 - val_accuracy: 0.8131 - val_loss: 0.4291 Epoch 28/50 100/100 - 0s - 4ms/step - accuracy: 0.8153 - loss: 0.4207 - val_accuracy: 0.8150 - val_loss: 0.4272 Epoch 29/50 100/100 - 1s - 6ms/step - accuracy: 0.8158 - loss: 0.4188 - val_accuracy: 0.8169 - val_loss: 0.4253 Epoch 30/50 100/100 - 1s - 5ms/step - accuracy: 0.8172 - loss: 0.4169 - val_accuracy: 0.8175 - val_loss: 0.4233 Epoch 31/50 100/100 - 1s - 6ms/step - accuracy: 0.8175 - loss: 0.4150 - val_accuracy: 0.8188 - val_loss: 0.4214 Epoch 32/50 100/100 - 0s - 3ms/step - accuracy: 0.8192 - loss: 0.4131 - val_accuracy: 
0.8194 - val_loss: 0.4193 Epoch 33/50 100/100 - 0s - 4ms/step - accuracy: 0.8200 - loss: 0.4111 - val_accuracy: 0.8213 - val_loss: 0.4171 Epoch 34/50 100/100 - 1s - 7ms/step - accuracy: 0.8219 - loss: 0.4091 - val_accuracy: 0.8213 - val_loss: 0.4150 Epoch 35/50 100/100 - 0s - 4ms/step - accuracy: 0.8236 - loss: 0.4071 - val_accuracy: 0.8225 - val_loss: 0.4127 Epoch 36/50 100/100 - 0s - 5ms/step - accuracy: 0.8255 - loss: 0.4050 - val_accuracy: 0.8269 - val_loss: 0.4104 Epoch 37/50 100/100 - 0s - 4ms/step - accuracy: 0.8272 - loss: 0.4029 - val_accuracy: 0.8294 - val_loss: 0.4082 Epoch 38/50 100/100 - 0s - 5ms/step - accuracy: 0.8295 - loss: 0.4008 - val_accuracy: 0.8338 - val_loss: 0.4060 Epoch 39/50 100/100 - 1s - 5ms/step - accuracy: 0.8317 - loss: 0.3988 - val_accuracy: 0.8338 - val_loss: 0.4038 Epoch 40/50 100/100 - 0s - 4ms/step - accuracy: 0.8334 - loss: 0.3967 - val_accuracy: 0.8356 - val_loss: 0.4017 Epoch 41/50 100/100 - 1s - 5ms/step - accuracy: 0.8352 - loss: 0.3947 - val_accuracy: 0.8344 - val_loss: 0.3996 Epoch 42/50 100/100 - 1s - 6ms/step - accuracy: 0.8366 - loss: 0.3927 - val_accuracy: 0.8369 - val_loss: 0.3977 Epoch 43/50 100/100 - 0s - 4ms/step - accuracy: 0.8377 - loss: 0.3908 - val_accuracy: 0.8381 - val_loss: 0.3960 Epoch 44/50 100/100 - 0s - 4ms/step - accuracy: 0.8375 - loss: 0.3889 - val_accuracy: 0.8388 - val_loss: 0.3943 Epoch 45/50 100/100 - 1s - 6ms/step - accuracy: 0.8373 - loss: 0.3871 - val_accuracy: 0.8381 - val_loss: 0.3926 Epoch 46/50 100/100 - 0s - 3ms/step - accuracy: 0.8377 - loss: 0.3853 - val_accuracy: 0.8388 - val_loss: 0.3911 Epoch 47/50 100/100 - 0s - 4ms/step - accuracy: 0.8380 - loss: 0.3837 - val_accuracy: 0.8363 - val_loss: 0.3896 Epoch 48/50 100/100 - 1s - 5ms/step - accuracy: 0.8389 - loss: 0.3821 - val_accuracy: 0.8369 - val_loss: 0.3883 Epoch 49/50 100/100 - 0s - 3ms/step - accuracy: 0.8406 - loss: 0.3806 - val_accuracy: 0.8381 - val_loss: 0.3870 Epoch 50/50 100/100 - 0s - 3ms/step - accuracy: 0.8413 - loss: 0.3791 - val_accuracy: 0.8363 - val_loss: 0.3857
print("Time taken in seconds ",end-start)
Time taken in seconds 25.28948974609375
plot(history2,'loss')
plot(history2,'accuracy')
model2_train_perf = model_performance_classification(model2,X_train,y_train)
model2_train_perf
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.841406 | 0.841406 | 0.827612 | 0.820257 |
model2_val_perf = model_performance_classification(model2,X_val,y_val)
model2_val_perf
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
| Accuracy | Recall | Precision | F1 Score | |
|---|---|---|---|---|
| 0 | 0.83625 | 0.83625 | 0.820166 | 0.814844 |
Observations:
- Model2 has a training recall of 84.1% and a validation recall of 83.6%
- The model took around 25 seconds to train, roughly double Model0's time
- The number of parameters is 281
- Training and validation recall are close, so the model generalizes well to unseen data; we will still try a couple more changes to the model and analyse further
- Model2 with epochs=50 performs better than Model0 with epochs=25
- We will use epochs=50 for the rest of the models
Model3: Neural Network with layers of 14, 7 and 1 neurons, with class weights and without a defined batch size¶
Build Neural Network
- two hidden layers with 14 and 7 neurons respectively
- activation function of ReLU.
- SGD as the optimizer
- No batch size defined (Keras defaults to 32)
- Class Weights used
# defining the # epochs (no batch size specified; Keras defaults to 32)
epochs = 50
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
#Initializing the neural network of 14,7 and 1
model3 = Sequential()
#First Hidden Layer
model3.add(Dense(14,activation="relu",input_dim=X_train.shape[1]))
#Second Hidden Layer
model3.add(Dense(7,activation="relu"))
#output layer :Sigmoid is used for binary classification
#we have an output of 1 node, which is the desired dimension of our output: whether the customer will exit or not
model3.add(Dense(1,activation="sigmoid"))
model3.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 14)                  │             168 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 7)                   │             105 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 1)                   │               8 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
Total params: 281 (1.10 KB)
Trainable params: 281 (1.10 KB)
Non-trainable params: 0 (0.00 B)
optimizer = tf.keras.optimizers.SGD() # defining SGD as the optimizer to be used
model3.compile(loss='binary_crossentropy', optimizer=optimizer,metrics=['accuracy'])
start = time.time()
history3 = model3.fit(X_train, y_train, epochs=epochs, validation_data=(X_val,y_val),class_weight=cw_dict,verbose=2)
end = time.time()
Epoch 1/50 200/200 - 2s - 9ms/step - accuracy: 0.6136 - loss: 1.4176 - val_accuracy: 0.6081 - val_loss: 0.6392 Epoch 2/50 200/200 - 1s - 3ms/step - accuracy: 0.6255 - loss: 1.3146 - val_accuracy: 0.6612 - val_loss: 0.6015 Epoch 3/50 200/200 - 1s - 3ms/step - accuracy: 0.6675 - loss: 1.2558 - val_accuracy: 0.6762 - val_loss: 0.5747 Epoch 4/50 200/200 - 1s - 4ms/step - accuracy: 0.6819 - loss: 1.2100 - val_accuracy: 0.6881 - val_loss: 0.5624 Epoch 5/50 200/200 - 1s - 7ms/step - accuracy: 0.6873 - loss: 1.1753 - val_accuracy: 0.6956 - val_loss: 0.5561 Epoch 6/50 200/200 - 1s - 5ms/step - accuracy: 0.6953 - loss: 1.1498 - val_accuracy: 0.7031 - val_loss: 0.5528 Epoch 7/50 200/200 - 1s - 3ms/step - accuracy: 0.7006 - loss: 1.1299 - val_accuracy: 0.7075 - val_loss: 0.5480 Epoch 8/50 200/200 - 1s - 3ms/step - accuracy: 0.7067 - loss: 1.1124 - val_accuracy: 0.7119 - val_loss: 0.5414 Epoch 9/50 200/200 - 1s - 3ms/step - accuracy: 0.7123 - loss: 1.0963 - val_accuracy: 0.7206 - val_loss: 0.5333 Epoch 10/50 200/200 - 1s - 3ms/step - accuracy: 0.7195 - loss: 1.0801 - val_accuracy: 0.7287 - val_loss: 0.5267 Epoch 11/50 200/200 - 1s - 3ms/step - accuracy: 0.7302 - loss: 1.0649 - val_accuracy: 0.7312 - val_loss: 0.5207 Epoch 12/50 200/200 - 1s - 3ms/step - accuracy: 0.7409 - loss: 1.0496 - val_accuracy: 0.7344 - val_loss: 0.5138 Epoch 13/50 200/200 - 1s - 3ms/step - accuracy: 0.7475 - loss: 1.0359 - val_accuracy: 0.7419 - val_loss: 0.5049 Epoch 14/50 200/200 - 1s - 3ms/step - accuracy: 0.7513 - loss: 1.0239 - val_accuracy: 0.7456 - val_loss: 0.4996 Epoch 15/50 200/200 - 1s - 3ms/step - accuracy: 0.7545 - loss: 1.0134 - val_accuracy: 0.7475 - val_loss: 0.4950 Epoch 16/50 200/200 - 1s - 3ms/step - accuracy: 0.7561 - loss: 1.0043 - val_accuracy: 0.7506 - val_loss: 0.4917 Epoch 17/50 200/200 - 1s - 3ms/step - accuracy: 0.7578 - loss: 0.9964 - val_accuracy: 0.7538 - val_loss: 0.4872 Epoch 18/50 200/200 - 1s - 3ms/step - accuracy: 0.7588 - loss: 0.9897 - val_accuracy: 0.7519 - val_loss: 0.4840 Epoch 19/50 200/200 - 1s - 3ms/step - accuracy: 0.7592 - loss: 0.9841 - val_accuracy: 0.7500 - val_loss: 0.4819 Epoch 20/50 200/200 - 1s - 3ms/step - accuracy: 0.7620 - loss: 0.9791 - val_accuracy: 0.7531 - val_loss: 0.4803 Epoch 21/50 200/200 - 1s - 3ms/step - accuracy: 0.7633 - loss: 0.9750 - val_accuracy: 0.7544 - val_loss: 0.4770 Epoch 22/50 200/200 - 1s - 3ms/step - accuracy: 0.7645 - loss: 0.9705 - val_accuracy: 0.7581 - val_loss: 0.4745 Epoch 23/50 200/200 - 1s - 5ms/step - accuracy: 0.7659 - loss: 0.9661 - val_accuracy: 0.7613 - val_loss: 0.4727 Epoch 24/50 200/200 - 1s - 6ms/step - accuracy: 0.7663 - loss: 0.9620 - val_accuracy: 0.7650 - val_loss: 0.4719 Epoch 25/50 200/200 - 1s - 5ms/step - accuracy: 0.7677 - loss: 0.9580 - val_accuracy: 0.7631 - val_loss: 0.4708 Epoch 26/50 200/200 - 1s - 5ms/step - accuracy: 0.7684 - loss: 0.9544 - val_accuracy: 0.7631 - val_loss: 0.4702 Epoch 27/50 200/200 - 1s - 3ms/step - accuracy: 0.7694 - loss: 0.9512 - val_accuracy: 0.7638 - val_loss: 0.4693 Epoch 28/50 200/200 - 1s - 3ms/step - accuracy: 0.7702 - loss: 0.9483 - val_accuracy: 0.7650 - val_loss: 0.4693 Epoch 29/50 200/200 - 1s - 3ms/step - accuracy: 0.7713 - loss: 0.9454 - val_accuracy: 0.7681 - val_loss: 0.4673 Epoch 30/50 200/200 - 1s - 3ms/step - accuracy: 0.7725 - loss: 0.9425 - val_accuracy: 0.7681 - val_loss: 0.4665 Epoch 31/50 200/200 - 1s - 3ms/step - accuracy: 0.7744 - loss: 0.9398 - val_accuracy: 0.7700 - val_loss: 0.4648 Epoch 32/50 200/200 - 1s - 3ms/step - accuracy: 0.7748 - loss: 0.9377 - val_accuracy: 
0.7706 - val_loss: 0.4649 Epoch 33/50 200/200 - 1s - 3ms/step - accuracy: 0.7753 - loss: 0.9357 - val_accuracy: 0.7719 - val_loss: 0.4647 Epoch 34/50 200/200 - 1s - 3ms/step - accuracy: 0.7752 - loss: 0.9340 - val_accuracy: 0.7719 - val_loss: 0.4648 Epoch 35/50 200/200 - 1s - 3ms/step - accuracy: 0.7759 - loss: 0.9324 - val_accuracy: 0.7725 - val_loss: 0.4651 Epoch 36/50 200/200 - 1s - 3ms/step - accuracy: 0.7763 - loss: 0.9310 - val_accuracy: 0.7731 - val_loss: 0.4649 Epoch 37/50 200/200 - 1s - 3ms/step - accuracy: 0.7770 - loss: 0.9297 - val_accuracy: 0.7744 - val_loss: 0.4644 Epoch 38/50 200/200 - 1s - 3ms/step - accuracy: 0.7773 - loss: 0.9282 - val_accuracy: 0.7763 - val_loss: 0.4633 Epoch 39/50 200/200 - 1s - 3ms/step - accuracy: 0.7780 - loss: 0.9269 - val_accuracy: 0.7756 - val_loss: 0.4629 Epoch 40/50 200/200 - 1s - 3ms/step - accuracy: 0.7792 - loss: 0.9258 - val_accuracy: 0.7744 - val_loss: 0.4632 Epoch 41/50 200/200 - 1s - 3ms/step - accuracy: 0.7798 - loss: 0.9248 - val_accuracy: 0.7744 - val_loss: 0.4623 Epoch 42/50 200/200 - 1s - 5ms/step - accuracy: 0.7800 - loss: 0.9235 - val_accuracy: 0.7750 - val_loss: 0.4616 Epoch 43/50 200/200 - 1s - 6ms/step - accuracy: 0.7811 - loss: 0.9225 - val_accuracy: 0.7744 - val_loss: 0.4616 Epoch 44/50 200/200 - 1s - 4ms/step - accuracy: 0.7822 - loss: 0.9216 - val_accuracy: 0.7744 - val_loss: 0.4612 Epoch 45/50 200/200 - 1s - 3ms/step - accuracy: 0.7831 - loss: 0.9206 - val_accuracy: 0.7744 - val_loss: 0.4602 Epoch 46/50 200/200 - 1s - 3ms/step - accuracy: 0.7836 - loss: 0.9197 - val_accuracy: 0.7756 - val_loss: 0.4594 Epoch 47/50 200/200 - 1s - 3ms/step - accuracy: 0.7837 - loss: 0.9188 - val_accuracy: 0.7738 - val_loss: 0.4599 Epoch 48/50 200/200 - 1s - 3ms/step - accuracy: 0.7845 - loss: 0.9180 - val_accuracy: 0.7731 - val_loss: 0.4593 Epoch 49/50 200/200 - 1s - 3ms/step - accuracy: 0.7845 - loss: 0.9172 - val_accuracy: 0.7738 - val_loss: 0.4589 Epoch 50/50 200/200 - 1s - 3ms/step - accuracy: 0.7850 - loss: 0.9163 - val_accuracy: 0.7738 - val_loss: 0.4583
print("Time taken in seconds ",end-start)
Time taken in seconds 35.23708438873291
plot(history3,'loss')
plot(history3,'accuracy')
model3_train_perf = model_performance_classification(model3,X_train,y_train)
model3_train_perf
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
| | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.789687 | 0.789687 | 0.840805 | 0.804722 |
model3_val_perf = model_performance_classification(model3,X_val,y_val)
model3_val_perf
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
| | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.77375 | 0.77375 | 0.833078 | 0.79096 |
Observations:
- Model3 has a training recall of 78.9% and a validation recall of 77.4%.
- The number of parameters is 281.
- Train and validation recall are close, so the model generalises well on unseen data.
- Model3 does not perform as well as the other models.
Let's try a couple more changes to the model and analyse further:
- Next, we train the same architecture with an explicit batch size.
Model4: Neural Network of 14, 7, and 1 neurons with batch size and class weights¶
Build Neural Network:
- two hidden layers with 14 and 7 neurons respectively
- ReLU activation function
- SGD as the optimizer
- **class weights used**
- epochs = 50
- batch size = 64
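The fit call below passes cw_dict as the class_weight argument; cw_dict was computed earlier in the notebook. For reference, here is a minimal sketch of how balanced class weights can be derived with scikit-learn (an assumption about how cw_dict was built, not the notebook's original cell):
#Compute weights inversely proportional to each class's frequency in the training labels
from sklearn.utils.class_weight import compute_class_weight
classes = np.unique(y_train)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_train)
cw_dict = dict(zip(classes, weights))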
# defining the batch size and # epochs
epochs = 50
batch_size = 64
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
#Initializing the neural network of 14,7,1
model4 = Sequential()
#First Hidden Layer
model4.add(Dense(14,activation="relu",input_dim=X_train.shape[1]))
#Second Hidden Layer
model4.add(Dense(7,activation="relu"))
#output layer :Sigmoid is used for binary classification
#we have an output of 1 node, which is the desired dimension of our output: whether the customer will exit or not
model4.add(Dense(1,activation="sigmoid"))
model4.summary()
Model: "sequential"
| Layer (type) | Output Shape | Param # |
|---|---|---|
| dense (Dense) | (None, 14) | 168 |
| dense_1 (Dense) | (None, 7) | 105 |
| dense_2 (Dense) | (None, 1) | 8 |
Total params: 281 (1.10 KB)
Trainable params: 281 (1.10 KB)
Non-trainable params: 0 (0.00 B)
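The parameter count can be verified from the layer shapes: with 11 input features, the first hidden layer has 11 × 14 + 14 = 168 weights and biases, the second has 14 × 7 + 7 = 105, and the output layer has 7 × 1 + 1 = 8, giving 168 + 105 + 8 = 281 in total.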
optimizer = tf.keras.optimizers.SGD() # defining SGD as the optimizer to be used
model4.compile(loss='binary_crossentropy', optimizer=optimizer,metrics=['accuracy'])
start = time.time()
history4 = model4.fit(X_train, y_train, epochs=epochs, validation_data=(X_val,y_val),batch_size=batch_size,class_weight=cw_dict,verbose=2)
end=time.time()
Epoch 1/50
100/100 - 1s - 13ms/step - accuracy: 0.7422 - loss: 1.4103 - val_accuracy: 0.6956 - val_loss: 0.6667
Epoch 2/50
100/100 - 1s - 7ms/step - accuracy: 0.6241 - loss: 1.3718 - val_accuracy: 0.6256 - val_loss: 0.6783
...
Epoch 49/50
100/100 - 0s - 3ms/step - accuracy: 0.7817 - loss: 0.9339 - val_accuracy: 0.7569 - val_loss: 0.4765
Epoch 50/50
100/100 - 1s - 6ms/step - accuracy: 0.7828 - loss: 0.9326 - val_accuracy: 0.7569 - val_loss: 0.4757
print("Time taken in seconds ",end-start)
Time taken in seconds 25.282581329345703
plot(history4,'loss')
plot(history4,'accuracy')
model4_train_perf = model_performance_classification(model4,X_train,y_train)
model4_train_perf
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
| | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.7775 | 0.7775 | 0.837294 | 0.794541 |
model4_val_perf = model_performance_classification(model4,X_val,y_val)
model4_val_perf
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
| | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.756875 | 0.756875 | 0.826982 | 0.776645 |
Observations:
- Model4 has a training recall of 77.8% and a validation recall of around 75.7%.
- Training is faster than the earlier unbatched runs (about 25s here vs 35s for Model3) because the batch size of 64 halves the number of weight updates per epoch.
- The number of parameters is 281.
- Train and validation recall are lower than Model3's.
- Model4 does not perform as well as Model3 and the other models.
Model5: Neural Network of 32, 16, and 1 neurons with batch size and class weights¶
Build Neural Network:
- two hidden layers with 32 and 16 neurons respectively
- ReLU activation function
- SGD as the optimizer
- **class weights used**
- epochs = 50
- batch size = 64
# defining the batch size and # epochs
epochs = 50
batch_size = 64
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
#Initializing the neural network of 32,16 and 1
model5 = Sequential()
#First Hidden Layer
model5.add(Dense(32,activation="relu",input_dim=X_train.shape[1]))
#Second Hidden Layer
model5.add(Dense(16,activation="relu"))
#output layer :Sigmoid is used for binary classification
#we have an output of 1 node, which is the desired dimension of our output: whether the customer will exit or not
model5.add(Dense(1,activation="sigmoid"))
model5.summary()
Model: "sequential"
| Layer (type) | Output Shape | Param # |
|---|---|---|
| dense (Dense) | (None, 32) | 384 |
| dense_1 (Dense) | (None, 16) | 528 |
| dense_2 (Dense) | (None, 1) | 17 |
Total params: 929 (3.63 KB)
Trainable params: 929 (3.63 KB)
Non-trainable params: 0 (0.00 B)
optimizer = tf.keras.optimizers.SGD() # defining SGD as the optimizer to be used
model5.compile(loss='binary_crossentropy', optimizer=optimizer,metrics=['accuracy'])
start = time.time()
history5 = model5.fit(X_train, y_train, epochs=epochs, validation_data=(X_val,y_val),class_weight=cw_dict,batch_size=batch_size,verbose=2)
end = time.time()
Epoch 1/50
100/100 - 2s - 15ms/step - accuracy: 0.6127 - loss: 1.3241 - val_accuracy: 0.6525 - val_loss: 0.6546
Epoch 2/50
100/100 - 0s - 5ms/step - accuracy: 0.6791 - loss: 1.2598 - val_accuracy: 0.6994 - val_loss: 0.6281
...
Epoch 49/50
100/100 - 0s - 4ms/step - accuracy: 0.7898 - loss: 0.8976 - val_accuracy: 0.7763 - val_loss: 0.4618
Epoch 50/50
100/100 - 0s - 3ms/step - accuracy: 0.7906 - loss: 0.8965 - val_accuracy: 0.7775 - val_loss: 0.4612
print("Time taken in seconds ",end-start)
Time taken in seconds 25.097063302993774
plot(history5,'loss')
plot(history5,'accuracy')
model5_train_perf = model_performance_classification(model5,X_train,y_train)
model5_train_perf
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
| | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.790937 | 0.790937 | 0.843698 | 0.806151 |
model5_val_perf = model_performance_classification(model5,X_val,y_val)
model5_val_perf
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
| | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.7775 | 0.7775 | 0.833179 | 0.793943 |
Observations:
- Model5 has a training recall of around 79.1% and a validation recall of 77.8%.
- The number of parameters increased to 929.
- Train and validation recall are close but still not strong.
- Model5 does not perform better than the other models: increasing the number of neurons per layer did not help much.
- We will keep the 14-7-1 architecture for further improvement.
We will use the 14-7-1 network with batch_size=64 and epochs=50, without class weights, and try to optimise it further.
Model Performance Improvement¶
Model6: Neural Network with Adam Optimizer¶
- Let's change the optimizer to Adam.
- Adam adds momentum as well as an adaptive per-parameter learning rate; a sketch of the update rule follows this list.
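For intuition, here is a minimal NumPy sketch of a single Adam update (illustrative only, not the notebook's code; the default hyperparameters shown match tf.keras.optimizers.Adam):
import numpy as np
def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    # Exponential moving averages of the gradient (momentum) and of the squared gradient
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias-correct the estimates, which are initialised at zero
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # The effective step size adapts per parameter: it shrinks where gradients have been large
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v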
# defining the batch size and # epochs
epochs = 50
batch_size = 64
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
#Initializing the neural network of 14,7,1
model6 = Sequential()
#First Hidden Layer
model6.add(Dense(14,activation="relu",input_dim=X_train.shape[1]))
#Second Hidden Layer
model6.add(Dense(7,activation="relu"))
#output layer :Sigmoid is used for binary classification
#we have an output of 1 node, which is the desired dimension of our output: whether the customer will exit or not
model6.add(Dense(1,activation="sigmoid"))
model6.summary()
Model: "sequential"
| Layer (type) | Output Shape | Param # |
|---|---|---|
| dense (Dense) | (None, 14) | 168 |
| dense_1 (Dense) | (None, 7) | 105 |
| dense_2 (Dense) | (None, 1) | 8 |
Total params: 281 (1.10 KB)
Trainable params: 281 (1.10 KB)
Non-trainable params: 0 (0.00 B)
optimizer = tf.keras.optimizers.Adam() # defining Adam as the optimizer to be used
model6.compile(loss='binary_crossentropy', optimizer=optimizer,metrics=['accuracy'])
start = time.time()
history6 = model6.fit(X_train, y_train, epochs=epochs, validation_data=(X_val,y_val),batch_size=batch_size,verbose=2)
end=time.time()
Epoch 1/50
100/100 - 2s - 21ms/step - accuracy: 0.7786 - loss: 0.5406 - val_accuracy: 0.7969 - val_loss: 0.4902
Epoch 2/50
100/100 - 1s - 5ms/step - accuracy: 0.7977 - loss: 0.4623 - val_accuracy: 0.7994 - val_loss: 0.4531
...
Epoch 49/50
100/100 - 0s - 4ms/step - accuracy: 0.8639 - loss: 0.3317 - val_accuracy: 0.8587 - val_loss: 0.3519
Epoch 50/50
100/100 - 0s - 3ms/step - accuracy: 0.8642 - loss: 0.3315 - val_accuracy: 0.8594 - val_loss: 0.3518
print("Time taken in seconds ",end-start)
Time taken in seconds 25.39774799346924
plot(history6,'loss')
plot(history6,'accuracy')
model6_train_perf = model_performance_classification(model6,X_train,y_train)
model6_train_perf
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
| | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.864219 | 0.864219 | 0.855599 | 0.851926 |
model6_val_perf = model_performance_classification(model6,X_val,y_val)
model6_val_perf
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
| | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.859375 | 0.859375 | 0.849714 | 0.845986 |
Observations:
- Model6 has a training recall of 86.4% and a validation recall of 85.9%.
- The number of parameters is 281.
- Adam converges much faster than SGD: the validation loss after 50 epochs is about 0.35, versus about 0.46 for the earlier SGD runs.
- Train and validation recall are close, so the model generalises well on unseen data.
Model7: Neural Network with Adam Optimizer and Dropout¶
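Dropout randomly silences a fraction of a layer's units at each training step (50% here, matching Dropout(0.5) below), which discourages units from co-adapting and so reduces overfitting. Keras implements "inverted" dropout; a minimal NumPy sketch of the idea (illustrative only, not the notebook's code):
import numpy as np
rng = np.random.default_rng(42)
def inverted_dropout(activations, rate=0.5, training=True):
    # At inference time dropout is a no-op
    if not training:
        return activations
    # Zero each unit with probability `rate`, then rescale the survivors
    # by 1/(1 - rate) so the expected activation is unchanged
    mask = rng.random(activations.shape) >= rate
    return activations * mask / (1.0 - rate)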
# defining the batch size and # epochs
epochs = 50
batch_size = 64
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
#Initializing the neural network of 14,7,1
model7 = Sequential()
#First Hidden Layer
model7.add(Dense(14,activation="relu",input_dim=X_train.shape[1]))
#Adding Dropout to prevent overfitting
model7.add(Dropout(0.5))
#Second Hidden Layer
model7.add(Dense(7,activation="relu"))
#output layer :Sigmoid is used for binary classification
#we have an output of 1 node, which is the desired dimension of our output: whether the customer will exit or not
model7.add(Dense(1,activation="sigmoid"))
model7.summary()
Model: "sequential"
| Layer (type) | Output Shape | Param # |
|---|---|---|
| dense (Dense) | (None, 14) | 168 |
| dropout (Dropout) | (None, 14) | 0 |
| dense_1 (Dense) | (None, 7) | 105 |
| dense_2 (Dense) | (None, 1) | 8 |
Total params: 281 (1.10 KB)
Trainable params: 281 (1.10 KB)
Non-trainable params: 0 (0.00 B)
optimizer = tf.keras.optimizers.Adam() # defining Adam as the optimizer to be used
model7.compile(loss='binary_crossentropy', optimizer=optimizer,metrics=['accuracy'])
start = time.time()
history7 = model7.fit(X_train, y_train, epochs=epochs, validation_data=(X_val,y_val),batch_size=batch_size,verbose=2)
end=time.time()
Epoch 1/50
100/100 - 3s - 34ms/step - accuracy: 0.5678 - loss: 0.6990 - val_accuracy: 0.7775 - val_loss: 0.5708
Epoch 2/50
100/100 - 0s - 5ms/step - accuracy: 0.7891 - loss: 0.5471 - val_accuracy: 0.7962 - val_loss: 0.4809
...
Epoch 49/50
100/100 - 0s - 4ms/step - accuracy: 0.8381 - loss: 0.3751 - val_accuracy: 0.8569 - val_loss: 0.3586
Epoch 50/50
100/100 - 0s - 4ms/step - accuracy: 0.8400 - loss: 0.3800 - val_accuracy: 0.8587 - val_loss: 0.3587
print("Time taken in seconds ",end-start)
Time taken in seconds 26.715410470962524
plot(history7,'loss')
plot(history7,'accuracy')
model7_train_perf = model_performance_classification(model7,X_train,y_train)
model7_train_perf
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step
| | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.857969 | 0.857969 | 0.848112 | 0.843839 |
model7_val_perf = model_performance_classification(model7,X_val,y_val)
model7_val_perf
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
| | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.85875 | 0.85875 | 0.849153 | 0.84459 |
Observations:
- Model7 has a training recall of 85.8% and a validation recall of 85.9%.
- The number of parameters is 281.
- Train and validation recall are close and the model generalises well on unseen data. Let's try a couple more changes to the model and analyse further.
Balance Data with SMOTE, Split into Train/Validation/Test, and Normalise Data¶
X = data.drop("Exited", axis=1)
y = data["Exited"]
# Before train-test split:
smote = SMOTE(random_state=42) # Initialize SMOTE
X_resampled, y_resampled = smote.fit_resample(X, y) # Apply SMOTE
# Splitting data into training, validation and test sets:
# first we split data into 2 parts, say temporary and test
X_btemp, X_btest, y_btemp, y_btest = train_test_split(
X_resampled, y_resampled, test_size=0.2, random_state=42, stratify=y_resampled
)
# Splitting the Train dataset into the Training and Validation set.
X_btrain, X_bval, y_btrain, y_bval = train_test_split(X_btemp, y_btemp, test_size = 0.2, random_state = 42,stratify = y_btemp)
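Note: the cells above apply SMOTE to the full dataset before splitting, so synthetic points interpolated from rows that later end up in the validation and test folds can leak information into the evaluation. A common alternative is to split first and oversample only the training fold; a minimal sketch under that assumption (variable names here are hypothetical):
# Split the original, imbalanced data first ...
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
# ... then synthesise minority-class samples from the training fold only
X_tr_res, y_tr_res = SMOTE(random_state=42).fit_resample(X_tr, y_tr)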
# Check synthetic resampled target count
y_resampled.value_counts()
| Exited | count |
|---|---|
| 1 | 7963 |
| 0 | 7963 |
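Both classes now contain 7963 rows each, confirming that SMOTE synthesised enough minority-class (Exited = 1) samples to match the majority class.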
#Printing the shapes.
print(X_btrain.shape,y_btrain.shape)
print(X_bval.shape,y_bval.shape)
print(X_btest.shape,y_btest.shape)
(10192, 11) (10192,)
(2548, 11) (2548,)
(3186, 11) (3186,)
#Transform numerical features to a common scale
# defining the list of numerical features to normalize
num_features = ['CreditScore','Age','Tenure','Balance','NumOfProducts','EstimatedSalary']
scaler = StandardScaler()
X_btrain[num_features] = scaler.fit_transform(X_btrain[num_features])
X_bval[num_features] = scaler.transform(X_bval[num_features])
X_btest[num_features] = scaler.transform(X_btest[num_features])
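Note that the scaler is fitted on the training fold only and then reused to transform the validation and test folds, so the scaling parameters carry no information from those splits.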
Model8: Neural Network with Balanced Data (by applying SMOTE) and SGD Optimizer¶
# defining the batch size and # epochs
epochs = 50
batch_size = 64
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
#Initializing the neural network of 14,7,1
model8 = Sequential()
#First Hidden Layer
model8.add(Dense(14,activation="relu",input_dim=X_btrain.shape[1]))
#Second Hidden Layer
model8.add(Dense(7,activation="relu"))
#output layer :Sigmoid is used for binary classification
#we have an output of 1 node, which is the desired dimension of our output: whether the customer will exit or not
model8.add(Dense(1,activation="sigmoid"))
model8.summary()
Model: "sequential"
| Layer (type) | Output Shape | Param # |
|---|---|---|
| dense (Dense) | (None, 14) | 168 |
| dense_1 (Dense) | (None, 7) | 105 |
| dense_2 (Dense) | (None, 1) | 8 |
Total params: 281 (1.10 KB)
Trainable params: 281 (1.10 KB)
Non-trainable params: 0 (0.00 B)
optimizer = tf.keras.optimizers.SGD() # defining SGD as the optimizer to be used
model8.compile(loss='binary_crossentropy', optimizer=optimizer,metrics=['accuracy'])
start = time.time()
history8 = model8.fit(X_btrain, y_btrain, epochs=epochs, validation_data=(X_bval,y_bval),batch_size=batch_size,verbose=2)
end=time.time()
Epoch 1/50
160/160 - 2s - 15ms/step - accuracy: 0.6323 - loss: 0.6484 - val_accuracy: 0.6542 - val_loss: 0.6332
Epoch 2/50
160/160 - 1s - 3ms/step - accuracy: 0.6776 - loss: 0.6160 - val_accuracy: 0.6868 - val_loss: 0.6011
...
Epoch 49/50
160/160 - 1s - 3ms/step - accuracy: 0.8257 - loss: 0.3908 - val_accuracy: 0.8301 - val_loss: 0.3939
Epoch 50/50
160/160 - 1s - 3ms/step - accuracy: 0.8258 - loss: 0.3902 - val_accuracy: 0.8308 - val_loss: 0.3933
print("Time taken in seconds ",end-start)
Time taken in seconds 32.02013039588928
plot(history8,'loss')
plot(history8,'accuracy')
model8_train_perf = model_performance_classification(model8,X_btrain,y_btrain)
model8_train_perf
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step
| | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.828002 | 0.828002 | 0.828018 | 0.828 |
model8_val_perf = model_performance_classification(model8,X_bval,y_bval)
model8_val_perf
80/80 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
| | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.830848 | 0.830848 | 0.830938 | 0.830836 |
Observations:
- Model8 has a training recall of 82.8% and a validation recall of 83.1%.
- The number of parameters is 281.
- Train and validation recall are close; the model generalises well.
Model9: Neural Network with Balanced Data (by applying SMOTE) and Adam Optimizer¶
# defining the batch size and # epochs
epochs = 50
batch_size = 64
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
#Initializing the neural network of 14,7,1
model9 = Sequential()
#First Hidden Layer
model9.add(Dense(14,activation="relu",input_dim=X_btrain.shape[1]))
#Second Hidden Layer
model9.add(Dense(7,activation="relu"))
#output layer :Sigmoid is used for binary classification
#we have an output of 1 node, which is the desired dimension of our output: whether the customer will exit or not
model9.add(Dense(1,activation="sigmoid"))
model9.summary()
Model: "sequential"
| Layer (type) | Output Shape | Param # |
|---|---|---|
| dense (Dense) | (None, 14) | 168 |
| dense_1 (Dense) | (None, 7) | 105 |
| dense_2 (Dense) | (None, 1) | 8 |
Total params: 281 (1.10 KB)
Trainable params: 281 (1.10 KB)
Non-trainable params: 0 (0.00 B)
optimizer = tf.keras.optimizers.Adam() # defining Adam as the optimizer to be used
model9.compile(loss='binary_crossentropy', optimizer=optimizer,metrics=['accuracy'])
start = time.time()
history9 = model9.fit(X_btrain, y_btrain, epochs=epochs, validation_data=(X_bval,y_bval),batch_size=batch_size,verbose=2)
end=time.time()
Epoch 1/50
160/160 - 4s - 25ms/step - accuracy: 0.6672 - loss: 0.6117 - val_accuracy: 0.7425 - val_loss: 0.5332
Epoch 2/50
160/160 - 3s - 18ms/step - accuracy: 0.7763 - loss: 0.4799 - val_accuracy: 0.7955 - val_loss: 0.4560
...
Epoch 49/50
160/160 - 0s - 3ms/step - accuracy: 0.8328 - loss: 0.3706 - val_accuracy: 0.8316 - val_loss: 0.3771
Epoch 50/50
160/160 - 1s - 4ms/step - accuracy: 0.8331 - loss: 0.3703 - val_accuracy: 0.8316 - val_loss: 0.3770
print("Time taken in seconds ",end-start)
Time taken in seconds 35.624032735824585
plot(history9,'loss')
plot(history9,'accuracy')
model9_train_perf = model_performance_classification(model9,X_btrain,y_btrain)
model9_train_perf
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step
| | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.834282 | 0.834282 | 0.83431 | 0.834278 |
model9_val_perf = model_performance_classification(model9,X_bval,y_bval)
model9_val_perf
80/80 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
| | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.831633 | 0.831633 | 0.831782 | 0.831614 |
Observations:
- Model9 has a training recall of 83.4% and a validation recall of 83.2%.
- The number of parameters is 281.
- Train and validation recall are close and the model generalises well on unseen data.
Model10: Neural Network with Balanced Data (by applying SMOTE), Adam Optimizer, and Dropout¶
# defining the batch size and # epochs
epochs = 50
batch_size = 64
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
#Initializing the neural network of 14,7,1
model10 = Sequential()
#First Hidden Layer
model10.add(Dense(14,activation="relu",input_dim=X_btrain.shape[1]))
#Adding Dropout to prevent overfitting
model10.add(Dropout(0.5))
#Second Hidden Layer
model10.add(Dense(7,activation="relu"))
#output layer :Sigmoid is used for binary classification
#we have an output of 1 node, which is the desired dimension of our output: whether the customer will exit or not
model10.add(Dense(1,activation="sigmoid"))
model10.summary()
Model: "sequential"
| Layer (type) | Output Shape | Param # |
|---|---|---|
| dense (Dense) | (None, 14) | 168 |
| dropout (Dropout) | (None, 14) | 0 |
| dense_1 (Dense) | (None, 7) | 105 |
| dense_2 (Dense) | (None, 1) | 8 |
Total params: 281 (1.10 KB)
Trainable params: 281 (1.10 KB)
Non-trainable params: 0 (0.00 B)
optimizer = tf.keras.optimizers.Adam() # defining Adam as the optimizer to be used
model10.compile(loss='binary_crossentropy', optimizer=optimizer,metrics=['accuracy'])
start = time.time()
history10 = model10.fit(X_btrain, y_btrain, epochs=epochs, validation_data=(X_bval,y_bval),batch_size=batch_size,verbose=2)
end=time.time()
Epoch 1/50
160/160 - 5s - 31ms/step - accuracy: 0.5675 - loss: 0.6762 - val_accuracy: 0.7021 - val_loss: 0.6354
Epoch 2/50
160/160 - 2s - 11ms/step - accuracy: 0.6840 - loss: 0.6130 - val_accuracy: 0.7618 - val_loss: 0.5573
...
Epoch 49/50
160/160 - 1s - 3ms/step - accuracy: 0.8030 - loss: 0.4293 - val_accuracy: 0.8203 - val_loss: 0.4299
Epoch 50/50
160/160 - 0s - 3ms/step - accuracy: 0.8078 - loss: 0.4227 - val_accuracy: 0.8187 - val_loss: 0.4257
print("Time taken in seconds ",end-start)
Time taken in seconds 36.31879806518555
plot(history10,'loss')
plot(history10,'accuracy')
model10_train_perf = model_performance_classification(model10,X_btrain,y_btrain)
model10_train_perf
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step
|   | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.818878 | 0.818878 | 0.823452 | 0.818235 |
model10_val_perf = model_performance_classification(model10,X_bval,y_bval)
model10_val_perf
80/80 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
|   | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.818681 | 0.818681 | 0.822922 | 0.818084 |
Observations:
- Model10 has a training recall of ~81.9% and a validation recall of ~81.9%.
- The network has only 281 trainable parameters.
- Training and validation recall are very close, so the model generalises well to unseen data.
Model Performance Comparison and Final Model Selection¶
# training performance comparison
models_train_comp_df = pd.concat(
[
model0_train_perf.T,
model1_train_perf.T,
model2_train_perf.T,
model3_train_perf.T,
model4_train_perf.T,
model5_train_perf.T,
model6_train_perf.T,
model7_train_perf.T,
model8_train_perf.T,
model9_train_perf.T,
model10_train_perf.T
],
axis=1,
)
models_train_comp_df.columns = [
"Neural Network (SGD without class weight,epochs=25)",
"Neural Network (SGD without class weight with momentum)",
"Neural Network (SGD without class weight,epochs=50)",
"Neural Network (SGD without batchsize)",
"Neural Network (SGD with batchsize)",
"Neural Network (SGD with batchsize,Different Number of Neurons[32,16,1]",
"Neural Network (Adam)",
"Neural Network (Adam, dropout [0.5])",
"Neural Network (SGD, Balanced Data SMOTE)",
"Neural Network (Adam, Balanced Data SMOTE)",
"Neural Network (Adam,Balanced Data SMOTE,dropout [0.5])"
]
# validation performance comparison
models_val_comp_df = pd.concat(
[
model0_val_perf.T,
model1_val_perf.T,
model2_val_perf.T,
model3_val_perf.T,
model4_val_perf.T,
model5_val_perf.T,
model6_val_perf.T,
model7_val_perf.T,
model8_val_perf.T,
model9_val_perf.T,
model10_val_perf.T
],
axis=1,
)
models_val_comp_df.columns = [
"Neural Network (SGD without class weight,epochs=25)",
"Neural Network (SGD without class weight with momentum)",
"Neural Network (SGD without class weight,epochs=50)",
"Neural Network (SGD without batchsize)",
"Neural Network (SGD with batchsize)",
"Neural Network (SGD with batchsize,Different Number of Neurons[32,16,1]",
"Neural Network (Adam)",
"Neural Network (Adam, dropout [0.5])",
"Neural Network (SGD, Balanced Data SMOTE)",
"Neural Network (Adam, Balanced Data SMOTE)",
"Neural Network (Adam,Balanced Data SMOTE,dropout [0.5])"
]
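Because the train and validation lists above must be kept aligned with the column labels by hand, a dictionary that pairs each label with its performance frames once is a less error-prone way to build these tables. A minimal sketch, assuming the *_perf DataFrames defined earlier:
# Hypothetical refactor: pair each label with its train/validation performance
# frames once, so the two comparison tables cannot get out of sync.
perf_by_label = {
    "Neural Network (SGD without class weight,epochs=25)": (model0_train_perf, model0_val_perf),
    "Neural Network (Adam)": (model6_train_perf, model6_val_perf),
    # ... remaining nine models follow the same pattern ...
}
models_train_comp_df = pd.concat(
    {label: t.T for label, (t, v) in perf_by_label.items()}, axis=1
).droplevel(1, axis=1)
models_val_comp_df = pd.concat(
    {label: v.T for label, (t, v) in perf_by_label.items()}, axis=1
).droplevel(1, axis=1)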
models_train_comp_df
|   | Neural Network (SGD without class weight,epochs=25) | Neural Network (SGD without class weight with momentum) | Neural Network (SGD without class weight,epochs=50) | Neural Network (SGD without batchsize) | Neural Network (SGD with batchsize) | Neural Network (SGD with batchsize,Different Number of Neurons[32,16,1]) | Neural Network (Adam) | Neural Network (Adam, dropout [0.5]) | Neural Network (SGD, Balanced Data SMOTE) | Neural Network (Adam, Balanced Data SMOTE) | Neural Network (Adam,Balanced Data SMOTE,dropout [0.5]) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Accuracy | 0.833281 | 0.862031 | 0.841406 | 0.790937 | 0.777500 | 0.790937 | 0.864219 | 0.857969 | 0.834282 | 0.828002 | 0.818878 |
| Recall | 0.833281 | 0.862031 | 0.841406 | 0.790937 | 0.777500 | 0.790937 | 0.864219 | 0.857969 | 0.834282 | 0.828002 | 0.818878 |
| Precision | 0.821079 | 0.852781 | 0.827612 | 0.843698 | 0.837294 | 0.843698 | 0.855599 | 0.848112 | 0.834310 | 0.828018 | 0.823452 |
| F1 Score | 0.801198 | 0.849874 | 0.820257 | 0.806151 | 0.794541 | 0.806151 | 0.851926 | 0.843839 | 0.834278 | 0.828000 | 0.818235 |
models_val_comp_df
|   | Neural Network (SGD without class weight,epochs=25) | Neural Network (SGD without class weight with momentum) | Neural Network (SGD without class weight,epochs=50) | Neural Network (SGD without batchsize) | Neural Network (SGD with batchsize) | Neural Network (SGD with batchsize,Different Number of Neurons[32,16,1]) | Neural Network (Adam) | Neural Network (Adam, dropout [0.5]) | Neural Network (SGD, Balanced Data SMOTE) | Neural Network (Adam, Balanced Data SMOTE) | Neural Network (Adam,Balanced Data SMOTE,dropout [0.5]) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Accuracy | 0.825000 | 0.857500 | 0.836250 | 0.773750 | 0.756875 | 0.777500 | 0.859375 | 0.858750 | 0.830848 | 0.831633 | 0.818681 |
| Recall | 0.825000 | 0.857500 | 0.836250 | 0.773750 | 0.756875 | 0.777500 | 0.859375 | 0.858750 | 0.830848 | 0.831633 | 0.818681 |
| Precision | 0.806742 | 0.847249 | 0.820166 | 0.833078 | 0.826982 | 0.833179 | 0.849714 | 0.849153 | 0.830938 | 0.831782 | 0.822922 |
| F1 Score | 0.790613 | 0.844355 | 0.814844 | 0.790960 | 0.776645 | 0.793943 | 0.845986 | 0.844590 | 0.830836 | 0.831614 | 0.818084 |
# Difference between training and validation recall for each model
models_train_comp_df.loc["Recall"] - models_val_comp_df.loc["Recall"]
|   | Recall |
|---|---|
| Neural Network (SGD without class weight,epochs=25) | 0.008281 |
| Neural Network (SGD without class weight with momentum) | 0.004531 |
| Neural Network (SGD without class weight,epochs=50) | 0.005156 |
| Neural Network (SGD without batchsize) | 0.017187 |
| Neural Network (SGD with batchsize) | 0.020625 |
| Neural Network (SGD with batchsize,Different Number of Neurons[32,16,1]) | 0.013437 |
| Neural Network (Adam) | 0.004844 |
| Neural Network (Adam, dropout [0.5]) | -0.000781 |
| Neural Network (SGD, Balanced Data SMOTE) | 0.003434 |
| Neural Network (Adam, Balanced Data SMOTE) | -0.003630 |
| Neural Network (Adam,Balanced Data SMOTE,dropout [0.5]) | 0.000196 |
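Before settling on a final model, ranking the validation recall row makes the comparison explicit:
# Rank models by validation recall, best first
models_val_comp_df.loc["Recall"].sort_values(ascending=False)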
- Final Model: Neural Network (Adam)
- Reasoning:
    - Highest recall on the training set (~86.4%), indicating strong performance in learning from the training data.
    - Highest recall on the validation set (~85.9%), demonstrating good generalisation to unseen data.
    - On validation scores alone, Model 6 (the neural network with the Adam optimizer) outperforms the others, and its train-validation recall gap (~0.005) is small, suggesting it will perform reliably on new data.
We'll go ahead with this model as our final model.
Let's rebuild it and check its performance across multiple metrics.
Final Model¶
# defining the batch size and number of epochs
epochs = 50
batch_size = 64
# clears the current Keras session, resetting all layers and models previously created, freeing up memory and resources.
tf.keras.backend.clear_session()
# Initializing a neural network with layers of 14, 7, and 1 neurons
model = Sequential()
# First hidden layer
model.add(Dense(14, activation="relu", input_dim=X_train.shape[1]))
# Second hidden layer
model.add(Dense(7, activation="relu"))
# Output layer: sigmoid is used for binary classification.
# A single output node gives the probability that the customer will exit.
model.add(Dense(1, activation="sigmoid"))
model.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Layer (type)            ┃ Output Shape   ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ dense (Dense)           │ (None, 14)     │     168 │
├─────────────────────────┼────────────────┼─────────┤
│ dense_1 (Dense)         │ (None, 7)      │     105 │
├─────────────────────────┼────────────────┼─────────┤
│ dense_2 (Dense)         │ (None, 1)      │       8 │
└─────────────────────────┴────────────────┴─────────┘
Total params: 281 (1.10 KB)
Trainable params: 281 (1.10 KB)
Non-trainable params: 0 (0.00 B)
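As an aside, recent Keras versions prefer an explicit Input layer (imported earlier) over passing input_dim to the first Dense layer, which is deprecated. A minimal, equivalent sketch of the same 14-7-1 network:
# Equivalent definition using an explicit Input layer instead of input_dim;
# the layer shapes and the 281-parameter count are unchanged.
model = Sequential()
model.add(Input(shape=(X_train.shape[1],)))
model.add(Dense(14, activation="relu"))
model.add(Dense(7, activation="relu"))
model.add(Dense(1, activation="sigmoid"))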
optimizer = tf.keras.optimizers.Adam() # defining Adam as the optimizer to be used
model.compile(loss='binary_crossentropy', optimizer=optimizer,metrics=['accuracy'])
start = time.time()
history = model.fit(X_train, y_train, epochs=epochs, validation_data=(X_val,y_val),batch_size=batch_size,verbose=2)
end=time.time()
Epoch 1/50  100/100 - 2s - 23ms/step - accuracy: 0.7844 - loss: 0.5472 - val_accuracy: 0.7962 - val_loss: 0.4842
Epoch 2/50  100/100 - 2s - 16ms/step - accuracy: 0.7966 - loss: 0.4637 - val_accuracy: 0.7962 - val_loss: 0.4524
Epoch 3/50  100/100 - 0s - 3ms/step - accuracy: 0.7975 - loss: 0.4375 - val_accuracy: 0.7950 - val_loss: 0.4337
... (epochs 4-48 omitted; both losses decrease smoothly, with validation loss settling near 0.35) ...
Epoch 49/50 100/100 - 1s - 5ms/step - accuracy: 0.8609 - loss: 0.3316 - val_accuracy: 0.8562 - val_loss: 0.3519
Epoch 50/50 100/100 - 0s - 3ms/step - accuracy: 0.8609 - loss: 0.3313 - val_accuracy: 0.8550 - val_loss: 0.3517
print("Time taken in seconds ",end-start)
Time taken in seconds 25.94053626060486
plot(history,'loss')
plot(history,'accuracy')
model_train_perf = model_performance_classification(model,X_train,y_train)
model_train_perf
200/200 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step
|   | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.862344 | 0.862344 | 0.853659 | 0.848788 |
model_val_perf = model_performance_classification(model,X_val,y_val)
model_val_perf
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
|   | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.855 | 0.855 | 0.845015 | 0.838641 |
model_test_perf = model_performance_classification(model,X_test,y_test)
model_test_perf
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
|   | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|
| 0 | 0.8615 | 0.8615 | 0.853022 | 0.846844 |
# Predicting on train
y_train_pred = model.predict(X_train)
# Predicting on val
y_val_pred = model.predict(X_val)
# Predicting on test
y_test_pred=model.predict(X_test)
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
print("Classification Report - Training data",end="\n\n")
cr = classification_report(y_train,y_train_pred>0.5)
print(cr)
Classification Report - Training data
precision recall f1-score support
0 0.88 0.96 0.92 5096
1 0.77 0.47 0.58 1304
accuracy 0.86 6400
macro avg 0.82 0.71 0.75 6400
weighted avg 0.85 0.86 0.85 6400
print("Classification Report - Validation data",end="\n\n")
cr = classification_report(y_val,y_val_pred>0.5)
print(cr)
Classification Report - Validation data
precision recall f1-score support
0 0.87 0.96 0.91 1274
1 0.76 0.43 0.55 326
accuracy 0.85 1600
macro avg 0.81 0.70 0.73 1600
weighted avg 0.85 0.85 0.84 1600
print("Classification Report - Test data",end="\n\n")
cr = classification_report(y_test,y_test_pred>0.5)
print(cr)
Classification Report - Test data
precision recall f1-score support
0 0.87 0.97 0.92 1593
1 0.77 0.45 0.57 407
accuracy 0.86 2000
macro avg 0.82 0.71 0.74 2000
weighted avg 0.85 0.86 0.85 2000
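To complement the classification reports, a confusion matrix on the test set shows the raw counts of false negatives directly. A minimal sketch using the confusion_matrix and seaborn imports from earlier:
# Confusion matrix for the test set at the default 0.5 threshold
cm = confusion_matrix(y_test, y_test_pred > 0.5)
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues",
            xticklabels=["Not Exited", "Exited"], yticklabels=["Not Exited", "Exited"])
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()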
score = model.evaluate(X_test, y_test)
63/63 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8708 - loss: 0.3128
print("Test Loss:", score[0])
print("Test Accuracy:", score[1])
Test Loss: 0.3386223614215851
Test Accuracy: 0.8615000247955322
The weighted recall score on the test data is ~0.86.
However, the classification report shows that recall for the churn class (1) is only ~0.45 at the default 0.5 threshold, so more than half of the actual churners are missed; the high weighted score is driven mostly by the majority (non-churn) class.
The model can be further tuned to handle the minority class better, for example by adjusting the decision threshold, as sketched below.
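A minimal threshold-tuning sketch, choosing the cut-off that maximises the churn-class F1 score on the validation predictions computed above (the 0.20-0.60 search grid is an assumption):
# Sweep candidate decision thresholds on the validation set and keep the one
# with the best F1 score for the churn class (label 1).
best_t, best_f1 = 0.5, 0.0
for t in np.arange(0.20, 0.61, 0.05):
    f1 = f1_score(y_val, y_val_pred > t)
    if f1 > best_f1:
        best_t, best_f1 = t, f1
print("Best threshold:", round(best_t, 2), "with churn-class F1:", round(best_f1, 3))
# Re-score the test set at the tuned threshold
print(classification_report(y_test, y_test_pred > best_t))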
Actionable Insights and Business Recommendations¶
The bank can deploy the final model from this exercise to identify, with a reasonable degree of accuracy, whether a customer is likely to leave or remain a customer; this is easier and more time-efficient than manual review (a minimal save/load sketch for such deployment follows these recommendations).
The bank should provide better services or promotions to retain female customers, since the analysis shows that female customers are more likely to churn than male customers.
Business leaders should devise strategies for winning more customers in Spain and Germany.
The bank should design promotions and retention strategies for Germany in particular, as more customers exit there than in Spain or France.
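For the deployment step mentioned above, the trained network can be persisted and reloaded with the standard Keras save/load API. A minimal sketch (the file name is hypothetical):
# Save the final trained model to disk (hypothetical file name)
model.save("churn_model.keras")
# Later, e.g. inside a scoring service, reload it and score new customers
loaded_model = tf.keras.models.load_model("churn_model.keras")
churn_probabilities = loaded_model.predict(X_test[:5])  # probabilities for five sample customers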